AFFINITY SELECTED SIGNATURE PEPTIDES FOR PROTEIN
IDENTIFICATION AND QUANTIFICATION



Statement of Government Rights 
This invention was made with government support under a grant from
the National Institutes of Health, Grant Nos. 25431 and GM 59996. The U.S.
Government has certain rights in this invention.



Background of the Invention

DNA sequencing of the human genome has profoundly advanced our
understanding of the molecular anatomy of mammalian cells. However,
knowing the sequence of all the genes in a cell and extrapolating from this the
probable products a cell is capable of producing is not enough. It is clear that i)
not all genes are expressed to the same degree; ii) the DNA sequence does not
always tell you the structure of a protein in the cases of post-transcriptional and
post-translational modifications; iii) knowing the sequence of a gene tells you
nothing about the control of expression; iv) control of genetic expression is
extremely complicated and can vary from protein to protein; v) post-translational
modification can occur without de novo protein biosynthesis; and
vi) variables other than genomic DNA can be responsible for disease.

In addition, it has recently become apparent that there is a poor
correlation between genetic expression of mRNA, generally measured as cDNA,
and the amount of protein expressed by that mRNA. Changes in mRNA
concentration are not necessarily proportional to changes in protein
concentration. There are even many cases where mRNA will be up-regulated
and protein concentration will not change at all. The steady state concentration
of a protein can depend on the relative degree of expression from multiple genes
and the activity of these gene products in the synthesis of a specific protein.
Glycoproteins provide a good example. The concentration of a glycoprotein can
depend on the level to which the gene coding for the polypeptide backbone is
regulated, the presence of all the enzymes responsible for the synthesis and
attachment of the oligosaccharide to the polypeptide, and the concentration of
glycosidases and proteases that degrade the glycoprotein. For these reasons,
analysis of regulation using messenger RNA-based techniques such as "DNA
chips" alone is inadequate. It is clear that measuring the concentration of
mRNA that codes for the polypeptide backbone may either distort or fail to
recognize the total picture of how a protein is regulated. In cases where it is
desirable to know how protein expression levels change, direct measurement of
those levels may be needed.

Concentration and expression levels of specific proteins vary widely in
cells during the life cycle, both in absolute concentration and amount relative to
other proteins. Over- and under-expression are known to be indicators of genetic
errors, faulty regulation, disease, or a response to drugs. However, the small
number of proteins that are up- or down-regulated in response to a particular
stimulus is difficult to recognize with current technology. Further, it is
frequently difficult to predict which proteins are subject to regulation. The need
to examine 20,000 proteins in a cell to find the small number in regulatory flux is
a formidable problem. The ability to detect only the small number of up- or
down-regulated proteins in a complex protein milieu would substantially
enhance the value of proteomics.

Proteins in complex mixtures are generally detected by some type of
fractionation or immunological assay technique. The advantages of
immunological assay methods are their sensitivity, specificity for certain
structural features of antigens, low cost, and simplicity of execution.
Immunological assays are generally restricted to the determination of single
protein analytes. This means that multiple assays must be conducted to
determine even small numbers of proteins in a sample. Hormone-receptor
association, enzyme-inhibitor binding, DNA-protein binding and
lectin-glycoprotein association are other types of bioaffinity that have been
exploited in protein identification, but are not as widely used as
immunorecognition. Although not biospecific, immobilized metal affinity
chromatography (IMAC) is yet another affinity method that recognizes a
specific structural element of polypeptides (J. Porath et al., Nature 258: 598-599
(1992)).

The fractionation approach to protein identification in mixtures is often
more lengthy because analytes must be purified sufficiently to allow a detector
to recognize specific features of the protein. Properties ranging from chemical
reactivity to spectral characteristics and molecular mass have been exploited for
detection. Higher degrees of purification are required to eliminate interfering
substances as the detection mode becomes less specific. Since no single
purification mode can resolve thousands of proteins, multidimensional
fractionation procedures must be used with complex mixtures. Ideally, the
various separation modes constituting the multidimensional method should be
orthogonal in selectivity. The two-dimensional (2D) gel electrophoresis method
of O'Farrell (J. Biol. Chem. 250:4007-4021 (1975)) is a good example. The first
dimension exploits isoelectric focusing while the second is based on molecular
size discrimination. At the limit, 6000 or more proteins can be resolved. 2D gel
electrophoresis is now widely used in proteomics, where the objective is to
identify thousands of proteins in complex biological extracts.

The most definitive way to identify proteins in gels is by mass spectral
analysis of peptides obtained from a tryptic digest of the excised spot.
Digestion of an excised spot with trypsin typically generates about 30-200
peptides. Identification is greatly facilitated when peptide molecular mass can
be correlated with tryptic cleavage fragments predicted from a genomic
database. Computer-assisted mathematical deconvolution algorithms are used
to identify a protein based upon its "composite peptide signature." Proteins can
also be identified by their separation characteristics alone in some cases. The
advantage of 2D electrophoresis followed by tryptic mapping is that large
numbers of proteins can be identified simultaneously. However, the
disadvantages of the technique are (1) it is very slow and requires a large
number of either manual or robotic manipulations, (2) charged isoforms are
resolved whereas uncharged variants in which no new charge is introduced are
not, (3) proteins must be soluble to be examined, and (4) quantification by
staining is poor.

In addition to being used to identify proteins, 2-D gel electrophoresis has
also been used to assess relative changes in protein levels. The degree to which
the concentration of a protein changes can be determined by staining the gel and
visually observing those spots that changed. Alternatively, changes in the
concentration of a protein can be quantitated with a gel scanner. A control 2-D
gel is required to determine the concentration of the protein before it was either
up- or down-regulated. Tryptic cleavage of the excised spot and mass analysis
using mass spectrometry remains necessary to identify the protein whose
expression level has changed.

Promising new techniques are emerging that replace 2-D gel
electrophoresis. Most involve some combination of high performance liquid
chromatography (HPLC) or capillary electrophoresis (CE) with mass
spectrometry to either create a "virtual 2-D gel" or go directly to the peptide
level of analysis by tryptically digesting all the proteins in samples as the initial
step of analysis. The use of multidimensional chromatography (MDC) to
identify proteins in a complex mixture is faster, easier to automate, and couples
more readily to MS than 2D gel electrophoresis. One of the more attractive
features of chromatographic systems is that they allow many dimensions of
analysis to be coupled by analyte transfer between dimensions through
automated valve switching. A recent report of an integrated six dimensional
analytical system in which serum hemoglobin was purified and sequenced
automatically in <2 hours is an example (F. Hsieh et al., Anal. Chem. 68:455
(1996)). Subsequent to purification on an immunoaffinity column, hemoglobin
was desorbed into an ion-exchange column for buffer exchange and then
tryptically digested by passage through an immobilized trypsin column. Peptides
eluting from the immobilized enzyme column were concentrated and desalted on a
small, low-surface-area reversed-phase liquid chromatography (RPLC) column
and then transferred to an analytical RPLC column where they were separated
and introduced into a mass spectrometer through an electrospray interface.
Identification at the primary structure level was achieved by a combination of
chromatographic properties and multidimensional mass spectrometry of the
tryptic peptides. The ability of the immunosorbent to rapidly select the desired
analyte for analysis was a great asset to this analysis. Size-exclusion or
ion-exchange chromatography coupled to reversed-phase chromatography are other
examples of multidimensional systems, albeit of lower selectivity than those
using an immunosorbent.

Although the methods described above are highly selective and widely
used, they have some attributes that limit their efficacy. One is the need for
proteins to be soluble before they can be analyzed. This can be a serious
limitation in the case of membrane and structural proteins that are sparingly
soluble. A second is that it is desirable or even necessary in some cases for the
protein analyte to be of native structure during at least part of the analysis. This
is a limitation because it restricts the sample preparation protocol. Native
macromolecular structures are notoriously more difficult to analyze than small
molecules. The necessity for post-separation proteolysis, as in the 2D gel
approach, is another limitation. Large numbers of fractions must be subjected to
a 24 hour tryptic digestion protocol in the analysis of a single sample when
many proteins are being identified. The tryptic digestion step is necessary
because the mass of intact proteins is far less useful in searching DNA databases
than that of peptides derived from the protein. And finally, pure proteins are a
prerequisite for antibody preparation in all the immunorecognition methods.
The preparation of antibodies to an antigen is lengthy, laborious, and costly, and
many antigens have never been purified. This is particularly true of proteins
predicted by genomic data alone. Purification is complicated by the fact that
one does not know the degree to which a protein is expressed, whether it is part
of a multisubunit complex, or if it is post-translationally modified.

Additionally, there is the issue of quantification. Measuring either the
relative abundance of proteins or changes in protein concentration remains a
major challenge in proteomics. Improved methods for protein identification,
quantification and detection of regulatory (or relative) change of proteins,
especially for the identification and quantification of proteins within a complex
mixture, are clearly needed to advance the new science of proteomics.



Summary of the Invention
The present invention provides a method for protein identification and
quantification in complex mixtures that utilizes affinity selection of constituent
peptide fragments. These peptides function as analytical surrogates for the
proteins. The method of the invention makes it possible to identify a protein in
a sample, preferably a complex sample, without sequencing the entire protein.
In many cases the method allows for identification of a protein in a sample
without sequencing any part of the protein.

To "identify a protein" as that phrase is used herein means to determine
the identity of a protein or a class of proteins to which it belongs. Identifying a
protein within a complex mixture of proteins involves determining the presence
or absence of a particular protein or class of proteins in the mixture. Prior to
identifying the protein according to the method of the invention, it may be
suspected that a particular protein is in the mixture. On the other hand, the
protein content of the mixture may be largely unknown. Protein identification
according to the method may be used, for example, to catalog the contents of a
complex mixture or to discover heretofore unknown proteins (e.g., proteins that
are predicted from the genome but have not yet been isolated).

Proteolysis of most proteins yields at least one unique "signature
peptide." The method of the invention identifies these constituent signature
peptides, preferably utilizing mass spectrometry, thereby allowing the protein
comprising the signature peptide to be distinguished from all other proteins in a
complex mixture and identified.

Constituent peptides can provide a generic signature for proteins as well,
especially when major portions of the amino acid sequence of a series of protein
variants are homologous. Glycoprotein variants that differ in degree of
glycosylation but not amino acid sequence are an example. Proteins that have
been modified by proteolysis are another case. Peptides that are unique to a
variety of species of similar structure are defined as "generic signature
peptides", and the invention thus allows identification of a class of proteins by
detecting and characterizing their generic signature peptides.

Proteins in a sample are initially fragmented, either as part of the
method or in advance of applying the method. Fragmentation in solution can be
achieved using any desired method, such as by using chemical, enzymatic, or
physical means. It should be understood that as used herein, the terms
"cleavage", "proteolytic cleavage", "proteolysis", "fragmentation" and the like
are used interchangeably and refer to scission of a chemical bond within
peptides or proteins in solution to produce peptide or protein "fragments" or
"cleavage fragments." No particular method of bond scission is intended or
implied by the use of these terms. Fragmentation and the formation of peptide
cleavage fragments in solution are to be differentiated from similar processes in
the gas phase within a mass spectrometer. These terms are context specific and
relate to whether bond scission is occurring in solution or the gas phase in a
mass spectrometer.

Prior to proteolytic cleavage, the proteins are preferably alkylated with
an alkylating agent in order to prevent the formation of dimers or other adducts
through disulfide/dithiol exchange; optionally, the proteins are reduced prior to
alkylation in order to facilitate the alkylation reaction and subsequent
fragmentation. Some proteins are resistant to proteolysis unless they have been
reduced and alkylated prior to cleavage.

At least one peptide derived from the protein to be identified preferably
includes at least one affinity ligand. The affinity ligand can be endogenous or
exogenous. Preferably, the affinity ligand is endogenous, thereby simplifying
the method. If the affinity ligand is exogenous, the method optionally includes
covalently attaching at least one affinity ligand to at least one protein (or
peptide) in the sample before (or after) proteolytic cleavage. Optionally, the
affinity ligand is covalently linked to the alkylating agent. The peptides are
then contacted with a capture moiety to select peptides that contain the at
least one affinity ligand. If desired, a plurality of affinity ligands are
attached, each to at least one protein or peptide, and the peptides are
contacted with a plurality of capture moieties to select peptides that contain
at least one affinity ligand. Optionally, the selected peptides are fractionated
at this point in order to further simplify the mixture and make it amenable to
mass spectrometric analysis, yielding a plurality of peptide fractions.

Peptides are analyzed by mass spectrometry to detect at least one
peptide derived from the protein to be identified, thereby permitting
identification of the protein(s) from which the detected peptide was derived.
When the detected peptide is a signature peptide, the method further includes
determining the mass of the signature peptide and using the mass of the
signature peptide to identify the protein from which the detected peptide was
derived. Optionally, the amino acid sequence of all or a portion of a detected
peptide can be determined and used to identify the protein from which the
detected peptide was derived. In a preferred embodiment, the mass of the
signature peptide is compared with the masses of reference peptides derived
from putative fragmentation of a plurality of reference proteins in a database,
wherein the masses of the reference peptides are adjusted to include the mass of
the affinity ligand, if necessary. Prior to making this comparison, reference
peptides are optionally computationally selected to exclude those that do not
contain an amino acid upon which the affinity selection is based in order to
simplify the database comparison.
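
The database comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of the invention: the two reference sequences, the function names, the 0.01 Da tolerance, the choice of cysteine as the selected residue, and the 57.02146 Da adduct (standing in for "the mass of the affinity ligand") are all hypothetical, trypsin is assumed to cleave after K or R except before P, and one ligand is assumed per selected residue.

```python
import re

# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(sequence):
    """Putative tryptic cleavage: after K or R, but not when P follows."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide, residue="", ligand_mass=0.0):
    """Monoisotopic mass, plus one ligand mass per occurrence of `residue`
    (assumption: every selected residue carries exactly one ligand)."""
    base = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    n_sites = peptide.count(residue) if residue else 0
    return base + ligand_mass * n_sites


def candidate_proteins(observed_mass, database, residue,
                       ligand_mass=0.0, tol=0.01):
    """Return (protein, peptide) pairs whose adjusted mass matches."""
    hits = []
    for name, sequence in database.items():
        for pep in tryptic_peptides(sequence):
            if residue not in pep:
                continue  # exclude peptides the affinity step cannot select
            mass = peptide_mass(pep, residue, ligand_mass)
            if abs(mass - observed_mass) <= tol:
                hits.append((name, pep))
    return hits


# Hypothetical two-protein reference database and one observed mass.
REFERENCE_DB = {"protA": "MKCAGHRTTYK", "protB": "GASPKVCLLR"}
observed = peptide_mass("CAGHR", "C", 57.02146)
print(candidate_proteins(observed, REFERENCE_DB, "C", ligand_mass=57.02146))
```

Filtering on the selected residue before comparing masses is what shrinks the search space; the mass comparison itself is a simple tolerance test.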

The advantages of the method for protein identification of the invention
are numerous. Proteins themselves (which are large molecules compared to
peptides) do not need to be separated electrophoretically or
chromatographically, both time consuming steps. Moreover, affinity selection
yields a subpopulation of peptides (typically eliminating about 90% of peptides)
that is, advantageously, enriched for "signature peptides." If desired, multiple
selections can be used to produce the enriched, affinity-selected population,
further simplifying the process of protein identification. In many cases, a
protein can be identified from its signature peptides; it is not necessary to purify
the protein, sequence any part of it, or determine its composite peptide signature
in order to identify it.

The present invention further provides a post-synthetic isotope labeling
method useful for detecting differences in the concentration of metabolites
between two samples. Application of the isotope labeling method of the
invention is not limited to proteins, but can be used to identify or quantitate
other metabolites as well such as lipids, nucleic acids, polysaccharides,
glycopeptides, glycoproteins, and the like. The samples are preferably complex
mixtures, and the metabolite is preferably a protein or a peptide.
Advantageously, the method can be utilized with complex mixtures from
various biological environments. For example, the method of the invention can
be used to detect a protein or family of proteins that are in regulatory flux in
response to the application of a stimulus. Peptides derived from these proteins
exhibit substantially the same isotope ratios, which differ from the normalized
isotope ratio determined for proteins that are not in flux, indicating that they are
co-regulated. Or, samples can be obtained from different organisms, cells,
organs, tissues or bodily fluids, in which case the method permits determination
of the differences in concentration of at least one protein in the organisms, cells,
organs, tissues or bodily fluids from which the samples were obtained.

The post-synthetic isotope labeling method of the invention involves
attaching a first chemical moiety to a protein, peptide, or the cleavage products
of a protein in a first sample and a second chemical moiety to a protein, peptide,
or the cleavage products of a protein in a second sample to yield first and second
isotopically labeled proteins, peptides or protein cleavage products, respectively,
that are chemically equivalent yet isotopically distinct. The chemical moiety
can be a single atom (e.g., oxygen) or a group of atoms (e.g., an acetyl group).
The labeled proteins, peptides or peptide cleavage products are isotopically
distinct because they contain different isotopic variants of the same chemical
entity (e.g., a peptide in the first sample contains ¹H where the peptide in the
second sample contains ²H; or a peptide in the first sample contains ¹²C where
the peptide in the second sample contains ¹³C).
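
As a rough illustration of what "chemically equivalent yet isotopically distinct" means for the mass spectrum, the following Python sketch computes the m/z separation expected between the light and heavy members of a labeled pair. The function name and the two examples (three deuterium atoms in place of three hydrogens, and a single ¹⁸O in place of ¹⁶O on a doubly charged ion) are assumptions chosen for illustration; only the isotope masses are standard values.

```python
# Monoisotopic masses (Da) of the light and heavy isotopes named in the text.
ISOTOPE_MASS = {
    "1H": 1.007825, "2H": 2.014102,
    "12C": 12.000000, "13C": 13.003355,
    "16O": 15.994915, "18O": 17.999160,
}


def pair_spacing(light, heavy, n_atoms, charge=1):
    """m/z separation between the light- and heavy-labeled forms of a
    peptide that differ in `n_atoms` substituted atoms."""
    delta = ISOTOPE_MASS[heavy] - ISOTOPE_MASS[light]
    return n_atoms * delta / charge


# Hypothetical label carrying three 2H in place of three 1H, singly charged.
print(round(pair_spacing("1H", "2H", 3), 4))
# Hypothetical single 18O substitution observed on a doubly charged ion.
print(round(pair_spacing("16O", "18O", 1, charge=2), 4))
```

Because the two forms differ only in isotopic composition, they are expected to behave alike chemically and to appear as a pair of peaks separated by this spacing.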

When a complex protein mixture is being analyzed, isotopic labeling can
be performed either before or after cleavage of the proteins. Preferably,
isotopic labeling is performed after cleavage, and the first and second chemical
moieties are attached to at least one amino group, preferably the N-terminus,
and/or at least one carboxylic acid group, preferably the C-terminus, on the
peptides. Conveniently, the N-termini of proteins or peptides can be labeled in
an acetylation reaction, and/or the C-termini of proteins or peptides can be
labeled by incorporation of ¹⁸O from H₂¹⁸O in the hydrolysis reaction. In the
latter case, one chemical moiety is represented by ¹⁶O, the naturally occurring
isotope, and the other chemical moiety is represented by ¹⁸O; in effect, this
particular process can be considered as "isotopically labeling" only one of the
samples (the one that carries the ¹⁸O isotope). When both the N-termini and the
C-termini of proteins or peptides are isotopically labeled, it is possible to
differentiate between C-terminal peptides, N-terminal blocked peptides, and
those that are internal. Labeling both the N- and C-termini of the proteins or
peptides also facilitates the analysis of single amino acid polymorphisms.
Labeling at the N- and/or C-terminus allows all or substantially all proteolytic
peptides to be labeled, the advantages of which are discussed below.

At least a portion of each sample is typically mixed together to yield a 
combined sample, which is subjected to mass spectrometric analysis. Control 
and experimental samples are mixed after labeling, fractions containing the 
desired components are selected from the mixture, and the concentration ratio is
determined to identify analytes that have changed in concentration between the
two samples. However, actual mixing of the samples is not required, and the
mass spectrometric analysis can be carried out on each sample independently,
with the results then compared with the assistance of a computer to achieve the same end. This
important feature of the method significantly reduces processing time and 
facilitates automation of the process. 

The members of at least one pair of chemically equivalent, isotopically
distinct peptides optionally include at least one affinity ligand. The affinity
ligand can be endogenous or exogenous. If the affinity ligand is exogenous, the
method optionally includes covalently attaching at least one affinity ligand to
at least one protein (or peptide) in the sample before (or after) proteolytic
cleavage. Optionally, the affinity ligand is covalently linked to the alkylating
agent. Prior to determining the isotope ratios, the peptides are contacted with
a capture moiety to select peptides which contain the at least one affinity
ligand. If desired, a plurality of affinity ligands can be attached, each to at
least one protein or peptide, and the peptides are contacted with a plurality of
capture moieties to select peptides that contain at least one affinity ligand.
In a preferred embodiment, at least one "signature peptide" unique to a protein
is selected, and the signature peptide is subsequently used to identify the
protein from which it was derived.

In a preferred embodiment, the affinity ligand is distinct from the isotope
labeling moieties. In other words, the labeling step is not coupled to the
selection step. This allows the quantitation function and the selection function
to be independent of one another, permitting more freedom in the choice of
reagents and labeling sites and also allowing an isotopically labeled sample to
be assayed for different signature peptides. Another advantage of uncoupling
the labeling and selection steps is that labeling, if performed after cleavage, can
be applied in a manner to label all peptides, not just the peptide to be selected.

When the method involves labeling all peptide fragments, it is referred
to herein as the global internal standard technology (GIST) method (Fig. 1).
Components from control samples function as standards against which the
concentrations of components in experimental samples are compared. When the
differential labeling process is directed at primary amine groups, carboxyl
groups, or both in peptides produced during proteolysis of the proteome, an
internal standard is created for essentially every peptide in the mixture.
Possible, but rare, exceptions to this include peptides that are derivatized or
blocked on the N-terminus or C-terminus. Examples of N-terminal blocking include
f-met proteins found in bacterial systems, acylation of serum proteins, and the
formation of the cyclic moiety pyrrolidone carboxylic acid (pyroGlu or pGlu) at
an N-terminal glutamate. The C-terminus can be blocked due to the formation
of an amide or an ester; for example, many prenylated proteins are blocked at
the C-terminus with a methyl ester. In any event, because virtually all peptide
fragments in the sample are labeled, the method is referred to as a global
labeling strategy. This global internal standard technology (GIST) for labeling
may be used to quantify the relative concentration of all components in
complex mixtures.

As an example, an investigator can isotopically label all peptides (by
labeling the free amino group or the free carboxyl group that characterizes
nearly every peptide), then independently affinity label the isotopically labeled
peptides at other sites, either in parallel or in series. For instance, tyrosines in
an aliquot of a globally isotopically labeled peptide pool could be affinity labeled
(either before or after protein fragmentation), after which peptides containing
tyrosines could be selected. Then, another aliquot of the same peptide pool
could be selected for histidine-containing peptides. Alternatively, the selected
tyrosine-containing peptide subpopulation could be further selected for
histidine, depending on the interests of the investigator. Isotope ratios for any
of these selected peptides could be determined using mass spectrometry. See
Example V for examples of multiple selections on globally isotopically labeled
peptides.
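
The difference between selecting "in parallel" and "in series" can be pictured with a toy Python sketch. The peptide sequences and the `select` helper below are hypothetical; a real selection is a chemical or chromatographic step, which is modeled here only as a filter on whether a peptide contains the targeted residue.

```python
def select(peptides, residue):
    """Keep only peptides that contain the residue targeted by the
    affinity step (a stand-in for the physical selection)."""
    return [p for p in peptides if residue in p]


# Hypothetical pool of globally isotopically labeled peptides.
pool = ["AYGK", "HLLR", "YHEK", "GASK"]

# In parallel: two aliquots of the same pool, one selection each.
tyrosine_peptides = select(pool, "Y")
histidine_peptides = select(pool, "H")

# In series: the tyrosine-selected subpopulation is selected again.
tyrosine_then_histidine = select(tyrosine_peptides, "H")

print(tyrosine_peptides, histidine_peptides, tyrosine_then_histidine)
```

Because every peptide in the pool already carries its isotope label, each of the three selected subpopulations remains usable for isotope ratio measurement.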

Although the advantages of keeping the isotopic labeling step
independent of the selection criteria are significant and very clear, it should
nonetheless be understood that, if desired, the affinity ligand and the first and
second moieties used to isotopically label the peptides or proteins can be the
same, as in the case where proteins or peptides are affinity labeled at cysteine
with isotopically distinct forms of the alkylating agent, iodoacetic acid, coupled
to the affinity ligand biotin. It is significant that if cysteine-containing peptides
are to be selected, the investigator is generally limited to derivatizing the protein
prior to cleavage, as part of the reduction and alkylation process. In addition, it
should be cautioned that whenever isotopic labeling is coupled to the
selection process, only a subpopulation of the peptide fragments will be
isotopically labeled. Moreover, only one selection criterion can be effectively
used for comparative quantitative analysis of peptides. Application of a second
selection criterion selects for peptides that are not necessarily isotopically
labeled, rendering quantitative comparison impossible. If a second selection is
desired, the protein or peptide sample must be isotopically labeled a second time
with the new derivatizing agent.

Furthermore, unless peptides are globally labeled isotopically, it is not
possible to select and quantitatively compare peptides on the basis of an
inherent feature of the peptide (i.e., an endogenous affinity ligand). For
example, tyrosine phosphate-containing peptides selected using
immunochromatography, or histidine-containing peptides selected using IMAC
(see below) could not be quantitatively compared unless a global isotopic
labeling strategy was used. Selection using an endogenous affinity ligand (as
opposed to an exogenous ligand that needs to be linked to the peptide in a
separate step) is preferred in the method of the invention; therefore, the ability to
globally label the peptides is an extremely important and useful aspect of the
invention.

Optionally in the method of the invention, at some point prior to
determining the isotope ratios, the combined peptide sample is fractionated, for
example using a chromatographic or electrophoretic technique, to reduce its
complexity so that it is amenable to mass spectrometric analysis, yielding at
least one fraction containing the isotopically labeled first and second proteins
and/or peptides.

During mass spectrometric analysis, a normalized isotope ratio
characterizing metabolites whose concentration is the same in the first and
second samples is first determined, then the isotope ratio of the first and second
isotopically labeled metabolites is determined and compared to the normalized
isotope ratio. A difference between the isotope ratio of the first and second
isotopically labeled metabolites and the normalized isotope ratio is indicative of
a difference in concentration of the metabolite in the first and second samples.
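
The comparison against a normalized isotope ratio can be sketched as follows. This is a minimal Python illustration under stated assumptions: the peak intensities are invented, both intensities of every pair are assumed to be nonzero, the median ratio is used as a stand-in for the normalized ratio (which presumes that most peptides are unchanged between the samples), and the 1.5-fold threshold is arbitrary. None of these choices comes from the text above.

```python
from statistics import median


def changed_peptides(intensities, fold=1.5):
    """intensities maps peptide -> (first_sample_peak, second_sample_peak).

    Returns the normalized ratio and, for each peptide whose ratio departs
    from it by at least `fold` in either direction, the relative change.
    """
    ratios = {pep: second / first
              for pep, (first, second) in intensities.items()}
    # Assumption: the median ratio represents unchanged components.
    normalized = median(ratios.values())
    changed = {}
    for pep, ratio in ratios.items():
        relative = ratio / normalized
        if relative >= fold or relative <= 1 / fold:
            changed[pep] = relative
    return normalized, changed


# Hypothetical peak intensities for five labeled peptide pairs.
peaks = {
    "p1": (100.0, 100.0),
    "p2": (200.0, 210.0),
    "p3": (50.0, 49.0),
    "p4": (100.0, 300.0),
    "p5": (80.0, 82.0),
}
normalized_ratio, flagged = changed_peptides(peaks)
print(normalized_ratio, sorted(flagged))
```

Dividing each ratio by the normalized ratio corrects for unequal sample loading, so that only peptides whose concentration genuinely differs between the samples stand out.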

When the metabolites are affinity-labeled peptides derived from a 
protein, mass spectrometric analysis can be used to detect at least one peptide 
and identify the protein from which the detected peptide was derived. When the 

30 detected peptide is a signature peptide, the method preferably includes 

determining the mass of the signature peptide and using the mass of the 
signature peptide to identify the protein from which the detected peptide was 
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derived. The invention thus makes it possible to identify a protein in a sample, 
preferably a complex sample, without sequencing the entire protein. In many 
cases the method allows for identification of a protein in a sample without 
sequencing any part of the protein. In a preferred embodiment, the mass of the 
5 signature peptide compared with the masses of reference peptides derived from 

putative proteolytic cleavage of a plurality of reference proteins in a database, 
wherein the mass of the references peptides are adjusted to include the mass of 
the affinity ligand, if necessary. Prior to making this comparison, reference 
peptides are optionally computationally selected to exclude those that do not 

1 0 contain an amino acid upon which the affinity selection is based in order to 

simplify the database comparison. Optionally, the amino acid sequence of the 
detected peptide can be determined and used to identify the protein from which 
the detected peptide was derived. 
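
The database comparison described in this paragraph can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the two-protein database, the 300.0 Da ligand mass, the function names, and the assumption that one ligand is attached per selected residue are all hypothetical. Cleavage follows the usual trypsin rule (after lysine or arginine, but not before proline), and the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def reference_mass(peptide, selected_residue, ligand_mass):
    """Peptide mass plus one ligand mass per selected residue (an assumption)."""
    base = sum(RESIDUE_MASS[r] for r in peptide) + WATER
    return base + ligand_mass * peptide.count(selected_residue)


def identify(observed_mass, proteins, selected_residue, ligand_mass=0.0, ppm=10.0):
    """Return (protein, peptide) pairs whose adjusted mass matches within ppm."""
    hits = []
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence):
            if selected_residue not in peptide:
                continue  # excluded before the comparison, as in the text
            mass = reference_mass(peptide, selected_residue, ligand_mass)
            if abs(observed_mass - mass) / mass * 1e6 <= ppm:
                hits.append((name, peptide))
    return hits


# Hypothetical two-protein database and a hypothetical 300.0 Da ligand.
PROTEINS = {"protA": "AGCKLMR", "protB": "HWCRGGK"}
observed = reference_mass("AGCK", "C", 300.0)
print(identify(observed, PROTEINS, "C", ligand_mass=300.0))
```

Filtering on the selected residue before computing masses shrinks the set of reference peptides, which is the simplification the paragraph refers to.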

When a protein or peptide is present in one sample but not in another
sample, it can be difficult to determine which sample generated the single peak
observed during mass spectrometric analysis of the combined sample. This 
problem is addressed by double labeling the first sample, either before or after 
proteolytic cleavage, with two different isotopes or two different numbers of 
heavy atoms. The first sample is partitioned into first and second subsamples,
which are labeled with chemically equivalent moieties containing first and
second isotopes or numbers of heavy atoms, respectively. Polypeptides in the 
second sample are labeled with a chemically equivalent moiety containing a 
third isotope or number of heavy atoms greater than in the other two cases. The 
first, second and third labeling agents are chemically equivalent yet isotopically
distinct. Preferably, the labeling agents are acylating agents. The three samples
are combined and optionally fractionated to yield a plurality of peptide
fractions amenable to mass spectrometric isotope ratio analysis. The presence
of a doublet during mass spectrometric analysis due to the presence of the first
and second isotope labeling agents indicates the absence of the protein in the
second sample, and the presence of a single peak due to the presence of the
third isotope labeling agent indicates the absence of the protein in the first 
sample. 
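
The interpretation rule stated above can be written as a small decision function. This is an illustrative sketch only; the function name, the boolean inputs, and the returned strings are hypothetical, and the case in which all three labels are observed is inferred from the labeling scheme rather than stated in the text.

```python
def interpret_cluster(first_label, second_label, third_label):
    """Interpret which labeled forms of a peptide were observed.

    first_label and second_label mark the two subsamples of the first sample;
    third_label marks the second sample.
    """
    doublet = first_label and second_label
    if doublet and third_label:
        return "present in both samples"
    if doublet:
        return "absent from second sample"
    if third_label and not (first_label or second_label):
        return "absent from first sample"
    return "indeterminate"


print(interpret_cluster(True, True, False))
print(interpret_cluster(False, False, True))
```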

Sometimes a solution-based fragmentation of a protein mixture generates
two or more different peptides having identical mass and chromatographic
separation properties ("isobaric peptides"), such as peptides with the same
amino acid composition but different amino acid sequences. In this case, the
composite mass spectrum will not reflect the isotope ratios of the individual
peptides. However, the mass of one or more of the constituent fragment ions 
generated during gas phase fragmentation of the peptide will be different. These 
fragment ions can therefore be resolved by subjecting the precursor ions to a 
second dimension of mass spectrometry, provided the peptides are isotopically
labeled at either the N- or the C-terminus. Isotopic peaks from the first
dimension spanning a mass range of up to about 20 amu are selected for mass
spectrometric analysis in the second dimension. Fragmentation prior to the
second dimension of mass spectrometry can occur by either post-source decay
or collision-induced (or collision-activated) dissociation (CID or CAD) of the
precursor ion. The isotope ratio of those fragment ions that differ between
peptides can be used to quantify the peptides. 

This problem is not limited to isobaric peptides. When the difference 
between the masses of the labeling agents is 3 amu, a problem will occur any
time the peptide clusters are within 6 amu of each other such that they overlap.
A range of isotope peaks, for example a range of about 6 to about 10 amu for
deuterium-labeled peptides, is selected for mass spectrometric analysis in the
second dimension, and unique fragment ions can be located. When a broader
mass window is selected for use in the second dimension for deuterated
samples, ²H₃ and ¹H₃ N-acetyl labeled forms of the peptide will both be present
in the second dimension, and the ²H₃ and ¹H₃ labeling will only be found on the
fragment ions that contain portions of the molecule that were acetylated.
Quantification can be achieved by measuring the ²H₃ to ¹H₃ ratio in the second
dimension.

The methods for protein identification and, optionally, quantification
described herein offer the investigator a high degree of experimental flexibility
and are also very amenable to automation. They are, in addition, extremely
sensitive; for example, the use of mass spectrometry to uniquely define the
signature peptide (by its mass) makes it possible for the isotope labeling method 
of the invention to distinguish among single site protein polymorphisms. 

It should be noted that, while isotope labeling of the proteins or 
constituent peptides is useful for quantification and quantitative comparison of 
proteins and/or peptides in a complex mixture, isotope labeling is not necessary 
to identify proteins in a complex mixture. A protein can be identified by 
comparing the mass of a signature peptide to the masses of peptides in a peptide 
database formed from computational cleavage of a set of proteins. The absence 
of the need to isotopically label the protein or peptides facilitates automation 
and also makes protein identification using database searching algorithms 
easier, since the peptides do not include the mass of an exogenous isotope 
labeling reagent. 

The terms "a", "an", "the", and "at least one" include the singular as well 
as the plural unless specified to the contrary. 

Brief Description of the Drawings 
Figure 1 is a schematic representation of coupled and uncoupled 
methods of the invention.

Figure 2 is a reversed-phase chromatogram of proteins isolated from
bovine nuclei by chromatography on a Bandeiraea simplicifolia (BS-II) lectin
affinity column. Elution was achieved using a 0.20 M solution of N- 
acetylglucosamine. 

Figure 3 is a reversed-phase chromatogram of tryptic digested
glycopeptides isolated from bovine nuclei by chromatography on a BS-II lectin
affinity column. Elution was achieved using a 0.20 M solution of N- 
acetylglucosamine. 

Figure 4 (a)-(d) shows mass spectra of various glycopeptide fractions 
collected from the reversed phase column.

Figure 5 is a reversed-phase chromatogram of (a) a peptide map of
human serotransferrin and (b) two human serotransferrin glycopeptides isolated
from a concanavalin A column.

Figure 6 is a matrix-assisted laser desorption ionization-time of flight
(MALDI-TOF) mass spectrum of (a) the first glycopeptide from human
serotransferrin and (b) the second glycopeptide from human serotransferrin. 

Figure 7 is a reversed-phase chromatogram of (a) glycopeptides isolated 
from human serum and (b) glycopeptides isolated from human serum. 

Figure 8 is a mass spectrum of fractions isolated from human serum
containing (a) the first glycopeptide from human serotransferrin and (b) the
second glycopeptide from human serotransferrin. 

Figure 9 is a MALDI-mass spectrum of a deuterium labeled peptide 
containing four lysines. 

Figure 10 is a MALDI-TOF mass spectrum of (a) labeled and unlabeled
lysine-containing peptide in negative mode detection and (b) a lysine-containing
peptide detected in positive mode. 

Figure 11 is a MALDI mass spectrum of a peptide that contains (a) 
lysine and (b) arginine. 

Detailed Description of the Invention

Roughly 90% of the time, the amino acid sequence of a peptide 
fragment having a mass of over 500 daltons will be unique to the protein from 
which it is derived. This varies somewhat with the organism. Because of this 
uniqueness, these peptides are referred to herein as "signature peptides."
Signature peptides are often, but not always, characterized by features such as
low abundance amino acids such as cysteine or histidine, phosphorylation or
glycosylation, and antigenic properties. If one were to select from a pool of all
tryptic peptides produced from proteolysis of the proteome those peptides that
contain the low abundance amino acids histidine or cysteine, there would be
between one and four "signature peptides" per protein. The number depends to
some extent on the size of the protein.

A signature peptide is a peptide that is unique to a single protein and
preferably contains about 6 to about 20 amino acids. Enzymatic digestion of a 
complex mixture of proteins will therefore generate peptides, including 
signature peptides, that can theoretically be used to identify particular proteins 
in the complex mixture. Indeed, liquid chromatography, capillary 
electrophoresis, and mass spectrometry are much more adept at the analysis of 
peptides than the intact proteins from which they are derived. A complex 
mixture of proteins preferably contains at least about 100 proteins, more 
preferably it contains at least about 1000 proteins and it can contain several 
thousand proteins. However, when a complex mixture containing thousands of 
proteins is proteolytically digested, it is probable that a hundred thousand or 
more peptides will be generated during proteolysis. This is beyond the 
resolving power of liquid chromatography and mass spectrometry systems. 

This problem is solved in the present invention by utilizing a selection, 
preferably an affinity selection, after the proteolytic cleavage to select peptide 
fragments that contain specific amino acids, thereby substantially reducing the 
number of sample components that must be subjected to further analysis. The 
method for protein identification of the invention is well-suited to the 
identification of proteins in a complex mixture, and at a minimum includes 
proteolytic cleavage of a protein and affinity selection of the peptides. The 
affinity selection can be effected using an affinity ligand that has been 
covalently attached to the protein (prior to cleavage) or its constituent peptides
(after cleavage), or using an endogenous affinity ligand. The affinity selection
is preferably based on low abundance amino acids or post-translational
modifications so as to preferentially isolate "signature peptides." The method is
not limited by the affinity selection method(s) employed, and nonlimiting
examples of affinity selections are described herein and can also be found in the
scientific literature, for example in M. Wilchek, Meth. Enzymol. 34, 182-195
(1974). This approach enormously reduces the complexity of the mixture. If
desired, two or more affinity ligands (e.g., primary and secondary affinity
ligands) can be used, thereby allowing a finer selection. Illustrative examples of
pre- and post-digestion labeling are shown in Examples IV and V, below. 

Preferably, the affinity selected peptides are subjected to a fractionation 
step to reduce sample size prior to the determination of peptide masses. A
premise of the signature peptide strategy is that many more peptides are
generated during proteolysis than are needed for protein identification. This 
assumption means that large numbers of peptides potentially can be eliminated, 
while still leaving enough for protein identification. 

The method is not limited by the techniques used for selection and/or
fractionation. Typically, fractionation is carried out using single or
multidimensional chromatography such as reversed phase chromatography
(RPC), ion exchange chromatography, hydrophobic interaction chromatography,
size exclusion chromatography, or affinity fractionation such as immunoaffinity
and immobilized metal affinity chromatography. Preferably the fractionation
involves surface-mediated selection strategies. Electrophoresis, either slab gel
or capillary electrophoresis, can also be used to fractionate the peptides.
Examples of slab gel electrophoretic methods include sodium dodecyl sulfate
polyacrylamide gel electrophoresis (SDS-PAGE) and native gel electrophoresis.
Capillary electrophoresis methods that can be used for fractionation include
capillary gel electrophoresis (CGE), capillary zone electrophoresis (CZE),
capillary electrochromatography (CEC), capillary isoelectric focusing,
immobilized metal affinity chromatography and affinity electrophoresis.

Masses of the affinity-selected peptides, which include the "signature
peptides," are preferably determined by mass spectrometry, preferably using
matrix assisted laser desorption ionization (MALDI) or electrospray ionization
(ESI), and the mass of the peptides is analyzed using time-of-flight (TOF),
quadrupole, ion trap, magnetic sector or ion cyclotron resonance mass analyzers,
or a combination thereof including, without limitation, TOF-TOF and other
combinations. Preferably the mass of the peptides is determined with a mass
accuracy of about 10 ppm or better; more preferably, masses are determined
with a mass accuracy of about 5 ppm or better; most preferably they are
determined with a mass accuracy of about 1 ppm or better. The lower the ppm
value, the more accurate the mass determination and the less sequence data is 
needed for peptide identification. 
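
To make the ppm figures concrete, the following sketch converts a ppm accuracy into an absolute mass window. The function names and the 1500 Da example mass are assumptions chosen for illustration.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(theoretical, ppm):
    """Lower and upper mass bounds (Da) for a given ppm accuracy."""
    delta = theoretical * ppm / 1e6
    return theoretical - delta, theoretical + delta


# Hypothetical 1500 Da peptide at the three accuracies named in the text.
for ppm in (10.0, 5.0, 1.0):
    low, high = mass_window(1500.0, ppm)
    print(ppm, round(high - low, 4))
```

At 10 ppm the window around a 1500 Da peptide is 0.03 Da wide, whereas at 1 ppm it narrows to 0.003 Da, so fewer candidate peptides fall inside it.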

It should be understood that the term "protein," as used herein, refers to
a polymer of amino acids and does not connote a specific length of a polymer of
amino acids. Thus, for example, the terms oligopeptide, polypeptide, and
enzyme are included within the definition of protein, whether produced using
recombinant techniques, chemical or enzymatic synthesis, or naturally
occurring. This term also includes polypeptides that have been modified or
derivatized, such as by glycosylation, acetylation, phosphorylation, and the like.
When the term "peptide" is used herein, it generally refers to a protein fragment 
produced in solution. 

Selection of sample 

The method of the invention is designed for use in complex samples
containing a number of different proteins. Preferably the sample contains at
least about two proteins; more preferably it contains at least about 100 proteins;
still more preferably it contains at least about 1000 proteins. A sample can
therefore include total cellular protein or some fraction thereof. For example, a
sample can be obtained from a particular cellular compartment or organelle,
using methods such as centrifugal fractionation. The sample can be derived
from any type of cell, organism, tissue, organ, or bodily fluid, without
limitation. The method of the invention can be used to identify one or more
proteins in the sample, and is typically used to identify multiple proteins in a
single complex mixture. It should therefore be understood that when the
method of the invention is referred to, for simplicity, as a method for identifying 
"a protein" in a mixture that contains multiple proteins, the term "a protein" is 
intended to mean "at least one protein" and thus includes one or more proteins.

Fragmentation of proteins

Fragmentation of proteins can be achieved by chemical, enzymatic or
physical means, including, for example, sonication or shearing. Preferably, a
protease enzyme is used, such as trypsin, chymotrypsin, papain, glu-C, endo
lys-C, proteinase K, carboxypeptidase, calpain, subtilisin and pepsin; more
preferably, a trypsin digest is performed. Alternatively, chemical agents such as 
cyanogen bromide can be used to effect proteolysis. The proteolytic agent can 
be immobilized in or on a support, or can be free in solution. 

Selecting peptides with specific amino acids

Peptides from complex proteolytic digests that contain low abundance 
amino acids or specific post-translational modifications are selected (purified) to 
reduce sample complexity while at the same time aiding in the identification of 
peptides selected from the mixture. Selection of peptide fragments that contain
cysteine, tryptophan, histidine, methionine, tyrosine, tyrosine phosphate, serine
and threonine phosphate, O-linked oligosaccharides, or N-linked
oligosaccharides, or any combination thereof can be achieved. It is also 
possible to determine whether the peptide has a C-terminal lysine or arginine 
and at least one other amino acid. 

The present invention thus provides for selection of proteolytic cleavage
fragments that contain these specific amino acids or post-translational
modifications, and includes a method of purifying individual peptides
sufficiently that they are amenable to MALDI mass spectrometry
(MALDI-MS). In view of the fact that MALDI-MS can accommodate samples with
50-150 peptides and a good reversed phase chromatography (RPC) column can
produce 200 peaks, a high quality RPC-MALDI-MS system can be expected to
analyze a mixture of 10,000 to 30,000 peptides. Preliminary studies by others
with less powerful RPC-electrospray-MS systems support this conclusion (F.
Hsieh et al., Anal. Chem. 70:1847-1852 (1998)). Selection of ten or fewer
peptides from each protein would allow this system to deal with mixtures of
1,000 to 3,000 proteins in the worst case scenario. More stringent selection
would increase this number. The selection method chosen is thus very
important.
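
The capacity estimate in this paragraph is simple multiplication, reproduced here as a short sketch. The function and parameter names are hypothetical; the numbers are those given in the text.

```python
def system_capacity(peptides_per_spectrum, rpc_peaks, peptides_per_protein):
    """Return (total peptides, proteins) that the combined system can handle."""
    total_peptides = peptides_per_spectrum * rpc_peaks
    return total_peptides, total_peptides // peptides_per_protein


# 50 to 150 peptides per MALDI-MS sample, 200 RPC peaks, ten peptides per protein.
print(system_capacity(50, 200, 10))
print(system_capacity(150, 200, 10))
```

Selecting fewer peptides per protein raises the second figure in proportion, which is why more stringent selection increases the number of proteins that can be handled.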

Affinity tags 

An affinity tag used for selection can be endogenous to the protein, or it
can be added by chemical or enzymatic processes. The term "affinity tag," as
used herein, refers to a chemical moiety that functions as, or contains, an
affinity ligand that is capable of binding (preferably noncovalently, but covalent
linkages are contemplated also) to a second, "capture" chemical moiety, such
that a protein or peptide that naturally contains or is derivatized to include the
affinity tag can be selected (or "captured") from a pool of proteins or peptides
by contacting the pool with the capture moiety. The capture moiety is
preferably bound to a support surface, preferably a porous support surface, as a
stationary phase. Examples of suitable supports include porous silica, porous
titania, porous zirconia, porous organic polymers, porous polysaccharides, or
any of these supports in non-porous form. 

Preferably the interactions between the affinity tag and the capture 
moiety are specific and reversible (e.g., noncovalent binding or hydrolyzable 
covalent linkage), but they can, if desired, initially be, or subsequently be made,
irreversible (e.g., a nonhydrolyzable covalent linkage between the affinity tag
and the capture moiety). It is important to understand that the invention is not 
limited to the use of any particular affinity ligand. 

Examples of endogenous affinity ligands include naturally occurring 
amino acids such as cysteine (selected with, for example, an acylating reagent)
and histidine, as well as carbohydrate and phosphate moieties. A portion of the
protein or peptide amino acid sequence that defines an antigen can also serve as 
an endogenous affinity ligand, which is particularly useful if the endogenous 
amino acid sequence is common to more than one protein in the original 
mixture. In that case, a polyclonal or monoclonal antibody that selects for
families of polypeptides that contain the endogenous antigenic sequence can be
used as the capture moiety. An antigen is a substance that reacts with products
of an immune response stimulated by a specific immunogen, including
antibodies and/or T lymphocytes. As is known in the art, an antibody molecule
or a T lymphocyte may bind to various substances, for example, sugars, lipids,
intermediary metabolites, autacoids, hormones, complex carbohydrates,
phospholipids, nucleic acids, and proteins. As used herein, the term "antigen"
means any substance present in a peptide that may be captured by binding to an
antibody, a T lymphocyte, the binding portion of an antibody or the binding
portion of a T lymphocyte.

A non-endogenous (i.e., exogenous) affinity tag can be added to a
protein or peptide by, for example, first covalently linking the affinity ligand to
a derivatizing agent to form an affinity tag, then using the affinity tag to
derivatize at least one functional group on the protein or peptide. Alternatively,
the protein or peptide can be first derivatized with the derivatizing agent, then
the affinity ligand can be covalently linked to the derivatized protein or peptide
at a site on the derivatizing agent. An example of an affinity ligand that can be
covalently linked to a protein or peptide by way of chemical or enzymatic
derivatization is a peptide, preferably a peptide antigen or polyhistidine. A
peptide antigen can itself be derivatized with, for example, a 2,4-dinitrophenyl
or fluorescein moiety, which renders the peptide more antigenic. A peptide
antigen can be conveniently captured by an immunosorbent that contains a
bound monoclonal or polyclonal antibody specific for the peptide antigen. A
polyhistidine tag, on the other hand, is typically captured by an IMAC column
containing a metal chelating agent loaded with nickel or copper. Biotin,
preferably ethylenediamine terminated biotin, which can be captured by the
natural receptor avidin, represents another affinity ligand. Other natural
receptors can also be used as capture moieties in embodiments wherein their
ligands serve as affinity ligands. Other affinity ligands include dinitrophenol
(which is typically captured using an antibody or a molecularly imprinted
polymer), short oligonucleotides, and polypeptide nucleic acids (PNA) (which
are typically captured by nucleic acid hybridization). Molecularly imprinted
polymers can also be used for capture. The affinity ligand is typically linked to a
chemical moiety that is capable of derivatizing a selected functional group on a
peptide or protein, to form an affinity tag. An affinity ligand can, for example,
be covalently linked to maleimide (a protein or peptide derivatizing agent) to
yield an affinity tag, which is then used to derivatize the free sulfhydryl groups
in cysteine, as further described below.

Selecting cysteine-containing peptides 

It is a common strategy to alkylate the sulfhydryl groups in a protein 
before proteolysis. Alkylation is generally based on two kinds of reactions.
One is to alkylate with a reagent such as iodoacetic acid (IAA) or iodoacetamide
(IAM). The other is to react with vinyl pyridine, maleic acid, or
N-ethylmaleimide (NEM). This second derivatization method is based on the
propensity of -SH groups to add to the C=C double bond in a conjugated
system. Alkylating agents linked to an affinity ligand double as affinity tags
and can be used to select cysteine containing peptides after, or concomitant
with, alkylation. For example, affinity-tagged iodoacetic acid is a convenient 
selection for cysteine. 

Optionally, the protein is reduced prior to alkylation to convert all the 
disulfides (cystines) into sulfhydryls (cysteines) prior to derivatization.
Alkylation can be performed either prior to reduction (permitting the capture of
only those fragments in which the cysteine is free in the native protein) or after
reduction (permitting capture of the larger group containing all
cysteine-containing peptides, including those that are in the oxidized cystine form in the
native protein). 

Preparation of an affinity tagged N-ethylmaleimide may be achieved by
the addition of a primary amine-containing affinity tag to maleic anhydride.
The actual affinity tag may be chosen from among a number of species, including
peptide antigens, polyhistidine, biotin, dinitrophenol, and polypeptide
nucleic acids (PNA). Peptide and dinitrophenol tags are typically selected with
an antibody whereas the biotin tag is selected with avidin. When the affinity tag
includes as the affinity ligand a peptide, and when proteolysis of the protein
mixture is accomplished after derivatization using trypsin or lys-C, the peptide
affinity ligand preferably does not contain lysine or arginine, so as to prevent
the affinity ligand from also being cleaved during proteolysis. Biotin is a
preferred affinity ligand because it is selected with very high affinity and can be
captured with readily available avidin/streptavidin columns or magnetic beads.
As noted above, polyhistidine tags are selected in an immobilized metal affinity
chromatography (IMAC) capture step. This selection route has the advantage 
that the columns are much less expensive, they are of high capacity, and 
analytes are easily desorbed. 

Alternatively, cysteine-containing peptides or proteins can be captured
directly during alkylation without incorporating an affinity ligand into the
alkylating agent. An alkylating agent is immobilized on a suitable substrate,
and the protein or peptide mixture is contacted with the immobilized alkylating
agent to select cysteine-containing peptides or proteins. If proteins are selected,
proteolysis can be conveniently carried out on the immobilized proteins to yield
immobilized cysteine-containing peptides. Selected peptides or proteins are
then released from the substrate and subjected to further processing in 
accordance with the method of the invention. 

When alkylation is done in solution, excess affinity tagged alkylating
agent is removed prior to selection with an immobilized capture moiety. Failure
to do so will severely reduce the capacity of the capture sorbent. This is
because the tagged alkylating agent is used in great excess and the affinity
sorbent cannot discriminate between excess reagent and tagged peptides. This
problem is readily circumvented by using a small size exclusion column to
separate alkylated proteins from excess reagent prior to affinity selection. The
whole process can be automated (as further described below) by using a
multidimensional chromatography system with, for example, a size exclusion
column, an immobilized trypsin column, an affinity selector column, and a
reversed phase column. After size discrimination the protein is valved through
the trypsin column and the peptides in the effluent passed directly to the affinity
column for selection. After capture and concentration on the affinity column,
tagged peptides are desorbed from the affinity column and transferred to the
reversed phase column where they are again captured and concentrated.
Finally, the peptides are eluted with a volatile mobile phase and fractions
collected for mass spectral analysis. Automation in this manner has been found
to work well.

Selecting tyrosine-containing peptides 

Like cysteine, tyrosine is an amino acid that is present in proteins in 
limited abundance. It is known that diazonium salts add to the aromatic ring of
tyrosine ortho to the hydroxyl group; this fact has been widely exploited in the
immobilization of proteins through tyrosine. Accordingly, tyrosine-containing
peptides or proteins can be affinity-selected by derivatizing them with a
diazonium salt that has been coupled at its carboxyl group to a primary amine
on an affinity ligand, for example through the α-amino group on a peptide tag
as described above. Alternatively, the diazonium salt can be immobilized on a
suitable substrate, and the protein or peptide mixture is contacted with the
immobilized diazonium salt to select tyrosine-containing peptides or proteins.
If proteins are selected, proteolysis can be conveniently carried out on the
immobilized proteins to yield immobilized tyrosine-containing peptides.
Selected peptides or proteins are then released from the substrate and subjected
to further processing in accordance with the method of the invention. 

Selecting tryptophan-containing peptides 

Tryptophan is present in most mammalian proteins at a level of <3%.
This means that the average protein will yield only a few tryptophan containing
peptides. Selective derivatization of tryptophan has been achieved with
2,4-dinitrophenylsulfenyl chloride at pH 5.0 (M. Wilchek et al., Biochim. Biophys.
Acta 178:1-7 (1972)). Using an antibody directed against 2,4-dinitrophenol, an
immunosorbent was prepared to select peptides with this label. The advantage
of tryptophan selection is that the number of peptides will generally be small.

Selecting histidine-containing peptides

In view of the higher frequency of histidine in proteins, it would seem at 
first that far too many peptides would be selected to be useful. The great 
strength of the procedure outlined below is that it selects on the basis of the
number of histidines, not just the presence of histidine. Immobilized metal
affinity chromatography (IMAC) columns loaded with copper easily produce
ten or more peaks. The fact that a few other amino acids are weakly selected is
not a problem, and the specificity of histidine selection can, if desired, be
greatly improved by acetylation of primary amino groups. Fractions from the
IMAC column are transferred to an RPC-MALDI/MS system for analysis. The
number of peptides that can potentially be analyzed jumps to 100,000-300,000
in the IMAC approach. An automated IMAC-RPC-MALDI/MS system
essentially identical to that used for cysteine selection has been assembled. The
only difference is in substituting an IMAC column for the affinity sorbent and
changes in the elution protocol. Gradient elution in these systems is most easily
achieved by applying step gradients to the affinity column. After reduction,
alkylation, and digestion, the peptide mixture is captured on the IMAC column
loaded with copper. Peptides are isocratically eluted from the IMAC column using
imidazole or a change in pH, and directly transferred to the RPC column where
they are concentrated at the head of the column. The IMAC column is then taken
off line, the solvent lines of the instrument purged at 10 ml/minute for a few
seconds with RPC solvent A, and then the RPC column is gradient eluted and
column fractions collected for MALDI-MS. When this is done, the RPC
column is recycled with the next solvent for step elution of the IMAC column,
the IMAC column is then brought back on line, and the second set of peptides is
isocratically eluted from the IMAC column and transferred to the RPC column
where they are readsorbed. The IMAC column is again taken off-line, the
system purged, and the second set of peptides is eluted from the RPC column.
This process is repeated until the IMAC column has been eluted. Again,
everything leading up to MALDI-MS is automated.
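
The step-elution cycle described above can be summarized as an ordered action list. The sketch below is purely illustrative: it does not control any instrument, the function name is hypothetical, and the eluent labels passed to it are invented placeholders rather than conditions taken from the text.

```python
def imac_rpc_schedule(step_eluents):
    """Return the ordered actions for a series of IMAC step eluents."""
    actions = ["load digested peptide mixture onto copper-loaded IMAC column"]
    for index, eluent in enumerate(step_eluents):
        if index > 0:
            # Between cycles the RPC column is recycled with the next solvent.
            actions.append(f"recycle RPC column with {eluent}")
            actions.append("bring IMAC column back on line")
        actions.append(f"elute IMAC isocratically with {eluent} onto RPC column")
        actions.append("take IMAC column off line")
        actions.append("purge solvent lines with RPC solvent A")
        actions.append("gradient elute RPC column and collect fractions for MALDI-MS")
    return actions


# Hypothetical two-step elution; the eluent names are placeholders.
for action in imac_rpc_schedule(["eluent 1", "eluent 2"]):
    print(action)
```

Each additional IMAC step adds one full RPC gradient, so the total analysis time grows roughly linearly with the number of step eluents.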

Selecting post-translationally modified proteins

Post-translational modification plays an important role in regulation. 
For this reason, it is necessary to have methods that detect specific post- 
translational modifications. Advantageously, the method of the invention can 
distinguish among proteins having a single signature peptide where speciation 
occurs by post-translational modification, if the affinity ligand is associated 
with, or constitutes, the post-translational moiety (e.g., sugar residue or 
phosphate). Among the more important post-translational modifications are i) 
the phosphorylation of tyrosine, serine, or threonine; ii) N-glycosylation; and iii) 
O-glycosylation. 

Selecting phosphoproteins 

In the case of phosphorylated proteins, such as those containing 
phosphotyrosine and phosphoserine, selection can be achieved with monoclonal
antibodies that target specific phosphorylated amino acids. For example,
immunosorbent columns loaded with a tyrosine phosphate specific monoclonal
antibody are commercially available. Preferably, all proteins in a sample are
digested, then the immunosorbent is used to select only the tyrosine phosphate
containing peptides. As in other selection schemes, these peptides can be separated
by reversed phase chromatography and subjected to MALDI. 

Alternatively, selection of phosphopeptides can be achieved using 
IMAC columns loaded with gallium (M. Posewitz et al., Anal. Chem.
71(14):2883-2892 (1999)). Phosphopeptides can also be selected using anion
exchange chromatography, preferably on a cationic support surface, at acidic 
pH. 

In addition, because zirconate sorbents have high affinity for phosphate
containing compounds (C. Dunlap et al., J. Chromatogr. A 746:199-210 (1996)),
zirconia-containing chromatography is expected to be suitable for the
purification of phosphoproteins and phosphopeptides. Zirconate clad silica
sorbents can be prepared by applying zirconyl chloride dissolved in
2,4-pentanedione to 500 angstrom pore diameter silica and then heat treating the
support at 400°C. Another alternative is the porous zirconate support recently
described by Peter Carr (C. Dunlap et al., J. Chromatogr. A 746:199-210
(1996)). Phosphopeptides are eluted using a phosphate buffer gradient. In
many respects, this strategy is the same as that of the IMAC columns.

Selecting O-linked oligosaccharide containing peptides 

Glycopeptides can be selected using lectins. For example, lectin from
Bandeiraea simplicifolia (BS-II) binds readily to proteins containing
N-acetylglucosamine. This lectin is immobilized on a silica support and used to
affinity select O-glycosylated proteins, such as transcription factors,
containing N-acetylglucosamine and the glycopeptides resulting from proteolysis.
The protocol is essentially identical to the other affinity selection methods
described above. Following reduction and alkylation, low molecular weight
reagents are separated from proteins. The proteins are then digested with
trypsin, the glycopeptides are selected on the affinity column, and the
glycopeptides are then resolved by RPC. In the case of some transcription
factors, glycosylation is homogeneous and MALDI-MS of the intact glycopeptide
is unambiguous. That is not the case with the more complex O-linked
glycopeptides obtained from many other systems. Heterogeneity of glycosylation
at a particular serine will produce a complex mass spectrum that is difficult to
interpret. Enzymatic deglycosylation of peptides subsequent to affinity
selection is indicated in these cases. Deglycosylation can also be achieved
chemically with strong base and is followed by size exclusion chromatography to
separate the peptides from the cleaved oligosaccharides.

It is important to note that O-linked and N-linked glycopeptides are
easily differentiated by selective cleavage of serine linked oligosaccharides (E.
Roquemore et al., Meth. Enzymol. 230:443-460 (1994)). There are multiple
ways to chemically differentiate between these two classes of glycopeptides.
For example, basic conditions in which the hemiacetal linkage to serine is
readily cleaved can be utilized. In the process, serine is dehydrated to form an
α,β-unsaturated system (C=C-C=O). The C=C bond of this system may be
either reduced with NaBH₄ or alkylated with a tagged thiol for further affinity
selection. This would allow O-linked glycopeptides to be selected in the
presence of N-linked glycopeptides. The same result could be achieved with
enzymatic digestion.

Selecting N-linked oligosaccharide-containing peptides 

As with O-linked oligosaccharide-containing peptides, lectins can be
used to affinity select N-linked glycopeptides following reductive alkylation and
proteolysis. To avoid selecting O-linked glycopeptides, the peptide mixture is
subjected to conditions that cause selective cleavage of O-linked
oligosaccharides prior to affinity selection using the lectin. Preferably,
O-linked deglycosylation is achieved using a base treatment after reductive
alkylation, followed by size exclusion chromatography to separate the peptides
from the cleaved oligosaccharides. To address the potential problem of
heterogeneity of glycosylation, N-linked glycopeptides are deglycosylated after
selection. Automation can be achieved with immobilized enzymes, but long
residence times in the enzyme columns are needed for the three enzymatic
hydrolysis steps.

Identification of signature peptides and their parent proteins

After peptides of interest are detected using mass spectrometry, the
protein from which a peptide originated is determined. In most instances this
can be accomplished using a standard protocol that involves scanning either
protein or DNA databases for amino acid sequences that would correspond to
the proteolytic fragments generated experimentally, matching the mass of all
possible fragments against the experimental data (F. Hsieh et al., Anal. Chem.
70:1847-1852 (1998); D. Reiber et al., Anal. Chem. 70:673-683 (1998)). When
a DNA database is used as a reference database, open reading frames are
translated and the resulting putative proteins are cleaved computationally to
generate the reference fragments, using the same cleavage method that was used
experimentally. Likewise, when a protein database is used, proteolytic cleavage
is also performed computationally to generate the reference fragments. In
addition, masses of the reference peptide fragments are adjusted as necessary to
reflect derivatizations equivalent to those made to the experimental peptides, for
example to include the exogenous affinity tag. The presence of signature
peptides in the sample is detected by comparing the masses of the
experimentally generated peptides with the masses of signature peptides derived
from putative proteolytic cleavage of the set of reference proteins obtained from
the database. Software and databases suited to this purpose are readily available
either through commercial mass spectrometer software or the Internet.
Optionally, the peptide databases can be preselected or reduced in complexity
by removing peptides that do not contain the amino acid(s) upon which affinity
selection is based.
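
By way of illustration only, the database comparison described above can be sketched in Python as follows. The sketch assumes trypsin as the cleavage agent with no missed cleavages, neutral monoisotopic masses, and a fixed mass tolerance. The function names, the tolerance, and the example sequences are hypothetical and do not correspond to any particular commercial software.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide (free N- and C-termini)


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide, tag_mass=0.0, tagged_residue=None):
    """Neutral monoisotopic mass; tag_mass is added per tagged_residue present."""
    mass = WATER + sum(RESIDUE_MASS[aa] for aa in peptide)
    if tagged_residue:
        mass += tag_mass * peptide.count(tagged_residue)
    return mass


def build_reference(proteins, select_residue=None, tag_mass=0.0):
    """Return (mass, protein_name, peptide) for each reference peptide.

    When select_residue is given, peptides lacking that residue are dropped,
    mirroring the optional reduction in database complexity.
    """
    reference = []
    for name, sequence in proteins.items():
        for pep in tryptic_digest(sequence):
            if select_residue and select_residue not in pep:
                continue
            mass = peptide_mass(pep, tag_mass, select_residue)
            reference.append((mass, name, pep))
    return reference


def match_masses(observed, reference, tol_da=0.05):
    """Map each observed mass to the reference peptides within tol_da."""
    return {
        m: [(name, pep) for ref_m, name, pep in reference
            if abs(ref_m - m) <= tol_da]
        for m in observed
    }


# Hypothetical two-protein database, reduced to cysteine-containing peptides.
proteins = {"hypothetical_1": "MKCAR", "hypothetical_2": "GGKAAR"}
reference = build_reference(proteins, select_residue="C")
print(match_masses([348.16], reference))
```

An observed mass that matches more than one reference peptide remains ambiguous under this scheme and would call for the sequencing approaches discussed next.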

There will, of course, be instances where peptides cannot be identified
from databases or when multiple peptides in the database have the same mass.
One approach to this problem is to sequence the peptide in the mass
spectrometer by collision induced dissociation. Ideally this is done with a
MALDI-MS/MS or ESI-MS/MS instrument. Another way to proceed is to
isolate peptides and sequence them by a conventional method. Because the
signature peptide strategy is based on chromatographic separation methods, it is
generally relatively easy to purify peptides for amino acid sequencing if
sufficient material is available. Examples include conventional PTH-based
sequencing and the carboxypeptidase based C-terminal sequencing described for
MALDI-MS several years ago (D. Patterson et al., Anal. Chem. 67:3971-3978
(1995)). In cases where 6-10 amino acids can be sequenced from the
C-terminus of a peptide, it is often possible to synthesize DNA probes that would
allow selective amplification of the cDNA complement along with DNA
sequencing to arrive at the structure of the protein.

Internal standard quantification with signature peptides

There is a growing need to move beyond the massive effort to define
genetic and protein components of biological systems to the study of how they
and other cellular metabolites are regulated and respond to stimuli. The words
"stimulus" and "stimuli" are used broadly herein and mean any agent, event,
change in conditions or even the simple passage of time that may be associated
with a detectable change in expression of at least one metabolite within a cell,
without limitation. For example, a stimulus can be a change in growth
conditions, pH, nutrient supply, or temperature; contact with an exogenous
agent such as a drug or microbe; competition with another organism; and the
like. The term "metabolite" refers, in this context, to a cellular component,
preferably an organic cellular component, which can change in concentration in
response to a stimulus, and includes large biomolecules such as proteins,
polynucleotides, carbohydrates and fats, as well as small organic molecules such
as hormones, peptides, cofactors and the like.

Accordingly, in this aspect of the invention post-biosynthetic isotope
labeling of cellular metabolites, preferably proteins and peptides, is utilized to
detect cellular components that are up and/or down regulated in comparison to
control environments. Metabolites, such as proteins (or peptides if proteolysis
is employed) in control and experimental samples are post-synthetically
derivatized with distinct isotopic forms of a labeling agent and mixed before
analysis. Preferably, the samples are obtained from a "biological environment,"
which is to be broadly interpreted to include any type of biological system in
which enzymatic reactions can occur, including in vitro environments, cell
culture, cells at any developmental stage, whole organisms, organs, tissues,
bodily fluids, and the like. As between the two samples, labeled metabolites are
chemically equivalent but isotopically distinct. In this context, chemical
equivalence is defined by identical chromatographic and electrophoretic
behavior, such that the two metabolites cannot be separated from each other
using standard laboratory purification and separation techniques. For example,
a protein or peptide present in each sample may, after labeling, differ in mass by
a few atomic mass units when the protein or peptide from one sample is
compared to the same protein or peptide from the other sample (i.e., they are
isotopically distinct). However, these two proteins or peptides would ideally be
chemically equivalent as evidenced by their identical chromatographic behavior
and electrophoretic migration patterns.

Because >95% of cellular proteins do not change in response to a
stimulus, proteins (as well as other metabolites) in flux can be readily identified
by isotope ratio changes in species resolved, for example, by 2-D gel
electrophoresis or 2-D chromatography. Once these proteins are detected, they
can optionally be identified using the "signature peptide" approach as described
herein or any other convenient method. One example of how this method of the
invention can be used is to analyze patterns of protein expression in a breast
cancer cell before and after exposure to a candidate drug. The method can also
be used to analyze changes in protein expression patterns in a cell or an
organism as a result of exposure to a harmful agent. As yet another example,
the method can be used to track the changes in protein expression levels in a cell
as it is exposed, over time, to changes in light, temperature, electromagnetic
field, sound, humidity, and the like.

The internal standard method of quantification is based on the concept
that the concentration of an analyte ($A$) in a complex mixture of substances may
be determined by adding a known amount of a very similar, but distinguishable
substance ($\bar{A}$) to the solution and determining the concentration of $A$
relative to $\bar{A}$. Assuming that the relative molar response ($\Re$) of the
detection system for these two substances is known, then

$$[A] = [\bar{A}]\,\Re\,\Delta$$

The term $\Delta$ is the relative concentration of $A$ to that of the internal
standard $\bar{A}$ and is widely used in analytical chemistry for quantitative
analysis. It is important that $A$ and $\bar{A}$ are as similar as possible in
chemical properties so that they will behave the same way in all the steps of the
analysis. It would be very undesirable for $A$ and $\bar{A}$ to separate. One of
the best ways to assure a high level of behavioral equivalency is to isotopically
label either the internal standard ($\bar{A}$) or the analyte ($A$).
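
As a minimal numerical sketch of the internal standard relation, the following Python function multiplies the known concentration of the internal standard by the relative molar response and by the measured ratio of analyte to internal standard. The function name, argument names, and example values are hypothetical and serve only to illustrate the arithmetic.

```python
def analyte_concentration(standard_conc, relative_response, delta):
    """Analyte concentration from the internal standard relation.

    standard_conc     -- known concentration of the added internal standard
    relative_response -- relative molar response of the detection system
    delta             -- measured ratio of analyte to internal standard
    """
    if standard_conc < 0 or relative_response <= 0 or delta < 0:
        raise ValueError("inputs must be non-negative, with a positive response")
    return standard_conc * relative_response * delta


# Hypothetical values: 2.0 pmol/uL of standard, equal molar response,
# and an analyte signal half that of the standard.
print(analyte_concentration(2.0, 1.0, 0.5))
```

When the internal standard is an isotopic form of the analyte, the relative molar response is expected to be close to 1, so the measured ratio alone carries the quantitative information.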
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As noted above, it is difficult to determine whether a regulatory stimulus 
has caused a single protein, or a small group of proteins, in a complex mixture to
increase or decrease in concentration relative to other proteins in the sample.
Determining the magnitude of this change is an even more difficult problem.
The internal standard method apparently cannot be applied here because i) the
analytes $A_{1-n}$ undergoing change are of unknown structure and ii) it would be
difficult to select internal standards of nearly identical properties.

Post-synthetic isotope labeling of proteins in accordance with the
method of the invention advantageously creates internal standards from proteins
of unknown structure and concentration. Whenever there is a control, or
reference state, in which the concentration of proteins is at some reference level,
proteins in this control state can serve as internal standards. In a preferred
embodiment, constituent peptides are labeled after fragmentation of the proteins
in the sample. The timing of the labeling step provides an opportunity to label
every peptide in the mixture by choosing a labeling method that labels at the N
or the C terminus of a polypeptide. Application of the labeling method of the
invention after the proteins have been synthesized has a further advantage.
Although metabolic incorporation of labeled amino acids has been widely used
to label proteins, it is not very reproducible and is objectionable in human
subjects. Post-sampling strategies for incorporation of labels are much more
attractive.

A key advantage of the isotope labeling method of the invention is that it
detects relative change, not changes in absolute amounts of analytes. It is very
difficult to determine changes in absolute amounts of analytes that are present at
very low levels. This method is as sensitive to changes in very dilute analytes
as it is to those that are present at great abundance. Another important advantage
of this approach is that it is not influenced by quenching in the MALDI. This
means that large numbers of peptides can be analyzed irrespective of the
expected quenching.

The isotope labeling method of the invention allows identification of up-
and down-regulated proteins using the affinity selection methods described
above, 2-D gel electrophoresis, 1-D, 2-D or multi-dimensional chromatography,
or any combination thereof, and employs either autoradiography or mass
spectrometry. Examples of radioisotopes and stable mass isotopes that can be
used to label a metabolite post-biosynthetically include ²H, ³H, ¹³C, ¹⁴C, ¹⁵N,
¹⁷O, ¹⁸O, ³²P, ³³S, ³⁴S and ³⁵S, but it should be understood that the invention is
in no way limited by the choice of isotope. An isotope can be incorporated into an
affinity tag, or it can be linked to the peptide or protein in a separate chemical or
enzymatic reaction. It should be noted that affinity selection of peptides is an
optional step in the isotope labeling method of the invention, thus the inclusion
of an affinity ligand in the labeling agent is optional.

In one embodiment of the isotope labeling method, proteins are
isotopically labeled prior to cleavage. Proteins in a control sample are
derivatized with a labeling agent that contains an isotope, while proteins in an
experimental sample are derivatized with the normal labeling agent. The
samples are then combined. The derivatized proteins can be chemically or
enzymatically cleaved either before or after separation. Cleavage is optional;
isotopically labeled proteins can, if desired, be analyzed directly following a
fractionation step such as multidimensional chromatography, 2-D
electrophoresis or affinity fractionation. When the derivatized proteins are
cleaved before separation, the labeling agent preferably contains an affinity
ligand, and the tagged peptide fragments are first affinity selected, then
fractionated in a 1-D or 2-D chromatography system, after which they are
analyzed using mass spectrometry (MS). In instances where the derivatized
proteins are cleaved after fractionation, 2-D gel electrophoresis is preferably
used to separate the proteins. If the peptides have also been affinity labeled,
selection of the affinity-tagged peptides can be performed either before or after
electrophoresis. The objective of fractionation is to reduce sample complexity
to the extent that isotope ratio analysis can be performed, using a mass
spectrometer, on individual peptide pairs.

Mass spectrometric analysis can be used to determine peak intensities
and quantitate isotope ratios in the combined sample, to determine whether there
has been a change in the concentration of a protein between two samples, and to
facilitate identification of a protein from which a peptide fragment, preferably a
signature peptide, is derived. Preferably, changes in peptide concentration
between the control and experimental samples are determined by isotope ratio
MALDI-mass spectrometry because MALDI-MS allows the analysis of more
complex peptide mixtures, but ESI-MS may also be used when the peptide
mixture is not as complex. In a complex combined mixture, there may be
hundreds to thousands of peptides, and many of them will not change in
concentration between the control and experimental samples. These peptides
whose levels are unchanged are used to establish the normalized isotope ratio
for peptides that were neither up nor down regulated. All peptides in which the
isotope ratio exceeds this value are up regulated. In contrast, those in which the
ratio decreases are down regulated. A difference in relative isotope ratio of a
peptide pair, compared to peptide pairs derived from proteins that did not
change in concentration, thus signals a protein whose expression level did
change between the control and experimental samples. If the peptide
characterized by an isotope ratio different from the normalized ratio is a
signature peptide, this peptide can be used according to the method of the
invention to identify the protein from which it was derived.

In another embodiment of the isotope labeling method of the invention,
isotope labeling takes place after cleavage of the proteins in the two samples.
Derivatization of the peptide fragments is accomplished using a labeling agent
that preferably contains an affinity ligand. On the other hand, an affinity ligand
can be attached to the peptides in a separate reaction, either before or after
isotopic labeling. If attached after isotopic labeling, the affinity ligand can be
attached before or after the samples are combined. The peptide fragments in the
combined mixture are affinity selected, then optionally fractionated using a 1-D
or multi-dimensional chromatography system, or a capillary or slab gel
electrophoretic technique, after which they are analyzed using mass
spectrometry. In instances where the peptides are not affinity tagged, they are
either affinity selected based on their inherent affinity for an immobilized ligand
(preferably using IMAC or immobilized antibody or lectin) or analyzed without
selection.

Alkylation with isotopically distinct reagents

Proteins in control and experimental samples can be alkylated using
different isotopically labeled iodoacetic acid (ICH₂COOH) subsequent to
reduction. In the case of radionuclide derivatized samples, the control is, for
example, derivatized with ¹⁴C labeled iodoacetic acid and the experimental
sample with ³H labeled iodoacetate. Polypeptides thus labeled can be resolved
by 2-D gel electrophoresis, as described in more detail below. When mass
spectrometry is used in detection, normal iodoacetate can be used to derivatize
the control and deuterated iodoacetate the experimental sample.

Based on the fact that proteins from control and experimental samples
are identical in all respects except the isotopic content of the iodoacetate
alkylating agent, their relative molar response ($\Re$) is expected to be 1. This
has several important ramifications. When control and experimental samples are
mixed:

$$A = \bar{A}\,\Delta$$

In this case $\Delta$ will be i) the same for all the proteins in the mixture that
do not change concentration in the experimental sample and ii) a function of the
relative sample volumes mixed. If the protein concentration in the two samples
is the same and they are mixed in a 1/1 ratio, for example, then $\Delta = 1$. With
a cellular extract of 20,000 proteins, $\Delta$ will probably be the same for
>19,900 of the proteins in the mixture. The concentration of a protein that is
either up- or down-regulated is expressed by the equation:

$$A_{exptl} = \bar{A}_{contl}\,\Delta\,\delta$$

where $A_{exptl}$ is a protein from the experimental sample that has been
synthetically labeled with a derivatizing agent, $\bar{A}_{contl}$ is the same
protein from the control sample labeled with a different isotopic form of the
derivatizing agent, and $\delta$ is the relative degree of up- or down-regulation.
Because $\Delta$ is an easily determined constant derived from the concentration
ratio of probably >95% of the proteins in a sample, $\delta$ is readily calculated
and proteins in regulatory flux easily identified.
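
The calculation just described can be sketched in Python as follows. Because the great majority of proteins are expected to share one experimental-to-control ratio, this sketch takes the median of all measured ratios as the mixing constant and divides each ratio by it to obtain the degree of regulation. The use of the median, the two-fold threshold, and all names and values are assumptions made for illustration only.

```python
from statistics import median


def estimate_mixing_ratio(ratios):
    """Estimate the common experimental/control ratio as the median ratio.

    Assumes most species are unchanged, so the median is dominated by them.
    """
    values = list(ratios.values())
    if not values:
        raise ValueError("no isotope ratios supplied")
    return median(values)


def regulation_degree(ratios):
    """Divide each measured ratio by the mixing ratio to get the fold change."""
    mixing_ratio = estimate_mixing_ratio(ratios)
    return {name: r / mixing_ratio for name, r in ratios.items()}


def classify(degree, fold=2.0):
    """Label a species by an arbitrary, hypothetical fold-change threshold."""
    if degree >= fold:
        return "up"
    if degree <= 1.0 / fold:
        return "down"
    return "unchanged"


# Hypothetical experimental/control peak ratios for five species.
ratios = {"p1": 1.2, "p2": 1.2, "p3": 1.2, "p4": 3.6, "p5": 0.3}
for name, degree in regulation_degree(ratios).items():
    print(name, round(degree, 2), classify(degree))
```

A median is used here rather than a mean so that a few strongly regulated species do not shift the estimate of the common ratio.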

Isotopic labeling of amines 

If not included as part of the alkylating agent, an isotope label can be
applied to the peptide as part of an affinity tag (if affinity selection is
contemplated), or at some other reactive site on the peptide. Although
application of the internal standard isotopic label in the affinity tag is
operationally simpler and, in some cases, more desirable, it requires that each
affinity tag be synthesized in at least two isotopic forms. Amine-labeling in a
separate step (i.e., uncoupling the label and the affinity ligand) is therefore a
preferred alternative.

Peptides that are generated by trypsin digestion (as well as those
generated by many other types of cleavage reactions) have a primary amino
group at their amino-terminus in all cases except those in which the peptide
originated from a blocked amino-terminus of a protein. Moreover, the
specificity of trypsin cleavage dictates that the C-terminus of signature peptides
will have either a lysine or arginine (except the C-terminal peptide from the
protein). In rare cases there may also be a lysine or arginine adjacent to the
C-terminus. Primary amino groups are easily acylated with, for example, acetyl
N-hydroxysuccinimide (ANHS). Thus, control samples can be acetylated with
normal ANHS whereas experimental tryptic digests can be acylated with either
¹³CH₃CO-NHS or CD₃CO-NHS. Our studies show that the ε-amino group of all
lysines can be derivatized in addition to the amino-terminus of the peptide, as
expected. This is actually an advantage in that it allows a determination of the
number of lysine residues in the peptide.

Essentially all peptides in both samples will be derivatized and hence
distinguishable from their counterparts using mass spectrometry. This means
that any affinity selection method or combination of affinity selection methods
(other than possibly those that select for arginine or lysine, which contain free
amines) can be used at any point in the process to obtain a selected population
enriched for signature peptides. For example, isotope labeling at amines can be
used to identify changes in the relative amounts of peptides selected on the basis
of cysteine, tryptophan, histidine, and a wide variety of post-translational
modifications. In this preferred embodiment of the method, isotopic labeling
and affinity labeling are two independent and distinct steps, and virtually all
peptides are isotopically labeled. This provides significantly more flexibility
and greater control over the production of signature peptides than is possible
when the alkylating agent doubles as the isotope labeling agent.

Isotopic labeling of hydroxyls and other functional groups

While acetylation is a convenient labeling method for proteins and their
constituent peptides, other labeling methods may be useful for other types of
cellular metabolites. For example, acetic anhydride can be used to acetylate
hydroxyl groups in the samples, and trimethylchlorosilane can be used for less
specific labeling of functional groups including hydroxyl groups, carboxylate
groups and amines.

Interpretation of the spectra 

Isotopically labeled samples (control and experimental) are mixed, then
subjected to mass spectrometry. In the case of labeled proteins (where no
proteolytic cleavage is carried out), the proteins are typically separated using
2-D gel electrophoresis, multidimensional chromatography, or affinity
fractionation such as immunoaffinity chromatography. Proteins from the
control and experimental samples will comigrate, since neither isoelectric
focusing (IEF), sodium dodecyl sulfate polyacrylamide gel electrophoresis
(SDS-PAGE), nor chromatographic systems can resolve the isotopic forms of a
protein. In the case of labeled peptides (whether or not affinity selected),
peptides are optionally subjected to fractionation (typically using reversed phase
chromatography or ion exchange chromatography) prior to analysis using mass
spectrometry.

Radioisotope counting techniques can be used to discriminate between
³H and ¹⁴C, and a mass spectrometer can readily differentiate between deuterated
and normal species, either as proteolytic fragments or in the whole protein when
it is of low (that is, under about 15 kD) molecular weight, allowing ratios of
protein abundance between the two samples to be established. The relative
abundance of most proteins will be the same and allow $\Delta$ to be calculated. A
second group of proteins will be seen in which the relative abundance of
specific proteins is much larger in the experimental sample. These are the
up-regulated proteins. In contrast, a third group of proteins will be found in which
the relative abundance of specific proteins is lower in the experimental sample.
These are the down-regulated proteins. The degree ($\delta$) to which proteins are
up- or down-regulated is calculated based on the computed value of $\Delta$.

A more detailed analysis of the interpretation of the resulting mass
spectra is provided using amine-labeled proteins as an example. Signature
peptides of experimental samples in this example are acetylated at the
amino-termini and on ε-amino groups of lysines with either ¹³CH₃CO- or CD₃CO-
residues; therefore any particular peptide will appear in the mass spectrum as a
doublet. In the simplest case where i) trideutero-acetic acid is used as the
labeling agent, ii) the C-terminus is arginine, iii) there are no other basic amino
acids in the peptide, and iv) the control and experimental samples are mixed in
exactly a 1/1 ratio before analysis, i.e., $\Delta = 1$, the spectrum shows a doublet
with peaks of approximately equal height separated by 3 amu. With 1 lysine the
doublet peaks are separated by 6 amu and with 2 lysines by 9 amu. For each
lysine that is added, the difference in mass between the experimental and control
peptides increases by an additional 3 amu. It is unlikely in practice that mixing
would be achieved in exactly a 1/1 ratio. Thus $\Delta$ will have to be determined
for each sample and will vary somewhat between samples. Within a given sample,
$\Delta$ will be the same for most peptides, as will also be the case in
electrophoresis. Peptides that deviate to any extent from the average value of
$\Delta$ are the ones of interest. The extent of this deviation is the value
$\delta$, the degree of up- or down-regulation. As indicated above, $\Delta$ will
be the same for greater than 95% of the proteins, or signature peptides, in a
sample.
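
The relationship between lysine content and doublet spacing can be sketched in Python as follows. The sketch uses the nominal 3 amu shift per trideuteroacetyl group given above and assumes an unblocked amino-terminus with complete derivatization of every primary amine. The function names and example sequences are hypothetical.

```python
NOMINAL_SHIFT_PER_LABEL = 3.0  # amu per trideuteroacetyl versus acetyl group


def acylation_sites(peptide):
    """One site at the amino-terminus plus one per lysine (K) residue."""
    return 1 + peptide.count("K")


def doublet_spacing(peptide, shift_per_label=NOMINAL_SHIFT_PER_LABEL):
    """Expected mass difference between experimental and control peptides."""
    return acylation_sites(peptide) * shift_per_label


def lysines_from_spacing(spacing, shift_per_label=NOMINAL_SHIFT_PER_LABEL):
    """Infer the number of lysines from an observed doublet spacing."""
    sites = round(spacing / shift_per_label)
    if sites < 1:
        raise ValueError("spacing is smaller than a single label shift")
    return sites - 1


# Hypothetical tryptic peptides with 0, 1 and 2 lysines.
for sequence in ("AGFR", "AKGFR", "KAKGR"):
    print(sequence, doublet_spacing(sequence))
print(lysines_from_spacing(9.1))
```

Rounding the observed spacing to the nearest whole number of labels tolerates small measurement error, but it would miscount if derivatization were incomplete.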

As noted above, amino acids with other functional groups are
occasionally labeled. In the presence of a large excess of acylating agent,
hydroxyl groups of serine, threonine, tyrosine, and carbohydrate residues in
glycoconjugates and the imidazole group of histidine can also be derivatized.
This does not interfere with quantification experiments, but complicates
interpretation of mass spectra if groups other than primary amines are
derivatized. In the case of hydroxyl groups, esters formed in the derivatization
reaction are readily hydrolyzed by hydroxylamine under basic conditions.
Acylation of imidazole groups, on the other hand, occurs less frequently than
esterification and is perhaps related to amino acid sequence around the histidine
residue.

Another potential problem with the interpretation of mass spectra in the
internal standard method of the invention can occur in cases where a protein is
grossly up- or down-regulated. Under those circumstances, there will
essentially be only one peak. When there is a large down-regulation this peak
will be the internal standard from the control. In the case of gross up-regulation,
this single peak will have come from the experimental sample. The problem is
how to know whether a single peak is from up- or down-regulation. This is
addressed by double labeling the control with CH₃CO-NHS and ¹³CH₃CO-NHS.
Because of the lysine issue noted above, it is necessary to split the control
sample into two lots and label them separately with CH₃CO-NHS and
¹³CH₃CO-NHS, respectively, and then remix. When this is done the control always
appears as a doublet separated by 1-2 amu, or 3 amu in the extreme case where
there are two lysines in the peptide. When double labeling the control with ¹²C
and ¹³C acetate and the experimental sample with trideuteroacetate, spectra
would be interpreted as follows. A single peak in this case would be an
indicator of strong up-regulation. The presence of the internal standard doublet
alone would indicate strong down-regulation.
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Another potential problem with the double labeled internal standard is 
how to interpret a doublet separated by 3 amu. Because the control sample was
labeled with CH₃CO-NHS and ¹³CH₃CO-NHS, this problem can arise only when
the signature peptide has 2 lysine residues and is substantially down-regulated
to the point that there is little of the peptide in the experimental sample. The
other feature of the doublet would be that the ratio of peak heights would be
identical to the ratio in which the isotopically labeled control peptides were
mixed. Thus, it may be concluded that any time a doublet appears alone in the
spectrum of a sample and $\Delta$ is roughly equivalent to that of the internal
standard, i) the two peaks came from the control sample and ii) peaks from the
experimental sample are absent because of substantial down-regulation.
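
The interpretation rules for a double-labeled control can be sketched as a simple decision function in Python. The sketch assumes that the peaks for one peptide have already been assigned to the unlabeled control form, the ¹³C control form, and the trideuteroacetylated experimental form, and that natural isotope contributions have been removed. The function name, the noise threshold, and the tolerance on the control mixing ratio are hypothetical.

```python
def interpret_bundle(light, c13, d3, noise=0.0, control_mix=1.0, rel_tol=0.2):
    """Interpret assigned peak intensities for one peptide.

    light, c13  -- intensities of the two control forms (the control doublet)
    d3          -- intensity of the experimental (trideuteroacetyl) form
    noise       -- intensities at or below this value count as absent
    control_mix -- ratio (c13 / light) in which the control lots were remixed
    rel_tol     -- allowed relative deviation from control_mix
    """
    has_light, has_c13, has_d3 = (x > noise for x in (light, c13, d3))
    if not (has_light or has_c13 or has_d3):
        return "no signal"
    if has_light and has_c13:
        if has_d3:
            return "quantifiable"
        # Control doublet alone: confirm it shows the expected control ratio.
        ratio = c13 / light
        if abs(ratio - control_mix) <= rel_tol * control_mix:
            return "strong down-regulation"
        return "ambiguous"
    if has_d3 and not (has_light or has_c13):
        return "strong up-regulation"
    return "ambiguous"


print(interpret_bundle(100, 98, 0))    # control doublet only
print(interpret_bundle(0, 0, 250))     # experimental peak only
print(interpret_bundle(100, 100, 150)) # all three forms present
```

Patterns that fit none of the stated rules are reported as ambiguous rather than forced into a category, since they may reflect incomplete labeling or overlapping species.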

Software development 

The isotope labeling method of the invention allows the identification of
the small number of proteins (peptides) in a sample that are in regulatory flux.
Observations of spectra with 50 or fewer peptides indicate that individual
species generally appear in the spectra as bundles of peaks consisting of the
major peptide ion followed by the ¹³C isotope peaks. Once a peak bundle has
been located, peak ratios within that bundle are evaluated and compared with
adjacent bundles in the spectrum. Based on the isotopes used in labeling,
simple rules can be articulated for the identification of up- and down-regulated
peptides in mass spectra. Software can be written that applies these rules for
interpretation.
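By way of illustration only, rules of this kind might be encoded as in the
following sketch. The function name, the noise floor, the two-fold threshold,
and the treatment of δ as the experimental-to-control ratio divided by Δ are
all assumptions made for the example and are not part of the method as
described.

```python
def classify_peak_bundle(control, experimental, big_delta=1.0,
                         noise_floor=0.0, fold_threshold=2.0):
    """Classify one peak bundle from its control and experimental intensities.

    big_delta is the ratio expected for unregulated species (assumed known).
    noise_floor and fold_threshold are hypothetical tuning parameters.
    """
    control_seen = control > noise_floor
    experimental_seen = experimental > noise_floor
    if not control_seen and not experimental_seen:
        return "not detected"
    if not control_seen:
        # Only the experimental peak is present.
        return "strongly up-regulated"
    if not experimental_seen:
        # Only the internal standard (control) peaks are present.
        return "strongly down-regulated"
    small_delta = (experimental / control) / big_delta
    if small_delta >= fold_threshold:
        return "up-regulated"
    if small_delta <= 1.0 / fold_threshold:
        return "down-regulated"
    return "unchanged"
```

A bundle with control intensity 100 and experimental intensity 300 would be
reported as up-regulated under these assumed thresholds.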

Data processed in this way can be evaluated in several modes. One is to
select a given peptide and then locate all other peptides that are close in δ value.
All peptides from the same protein should theoretically have the same δ value
(i.e., the same relative degree of up- or down-regulation). For example, when
more than one protein is present in the same 2-D gel spot there is the problem of
knowing which peptides came from the same protein. The δ values are very
useful in this respect, and provide an additional level of selection. The same is
true in 2-D chromatography. 3-D regulation maps of chromatographic retention
time vs. peptide mass vs. δ can also be constructed. This identifies proteins that
are strongly up- or down-regulated without regard to the total amount of protein
synthesized. In some experiments, one or more groups of proteins may be
identified that have similar δ values, and identification of the members of a
group may elucidate metabolic pathways that had not previously been
characterized.
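The first evaluation mode, locating peptides that are close in δ value, can be
sketched as a simple grouping step. The function name, the peptide labels, and
the 15% relative tolerance below are hypothetical; δ values are assumed to be
positive ratios.

```python
def group_by_delta(deltas, tolerance=0.15):
    """Group peptides whose delta values lie within a relative tolerance
    of the first (smallest) member of the current group.

    deltas maps a peptide label to its positive delta value.
    """
    groups = []
    for name, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
        if groups:
            anchor = groups[-1][0][1]  # delta of the group's first member
            if abs(delta - anchor) <= tolerance * anchor:
                groups[-1].append((name, delta))
                continue
        groups.append([(name, delta)])
    # Return only the peptide labels, grouped.
    return [[name for name, _ in group] for group in groups]
```

Peptides falling in one group are candidates for having come from the same
protein, or from proteins regulated to a similar degree.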

The internal standard method applied to 2-D gels 

Advantageously, the internal standard method of the invention can be
used in concert with conventional 2-D gel electrophoresis. The great advantage
of 2-D electrophoresis is that it can separate several thousand proteins and
provide a very good two dimensional display of a large number of proteins. The
method of the invention allows this two dimensional display to be used to
identify those species that are up- or down-regulated. Researchers in the past
have tried to do this by comparing the staining density of proteins from different
experiments (L. Anderson et al., Electrophoresis 17:443-453 (1997); S.
Pederson et al., Cell 14:179-190 (1977)). However, staining is not very
quantitative, it is difficult to see those proteins that are present in small amounts,
and multiple electrophoresis runs are required.

The detection and quantitation problems in 2-D gel electrophoresis can
be solved by post-biosynthetically derivatizing proteins with either
radionuclides or stable isotope labeling agents before electrophoresis to
facilitate detection and quantification. The great advantage of this approach is
that the labeling agents do not have to be used in the biological system. This
circumvents the necessity of in vivo radiolabeling that is so objectionable in
human studies with current labeling techniques. A second major advantage is
that the degree of up- or down-regulation can be determined in a single analysis
by using combinations of isotopes in the labeling agents, i.e., ¹⁴C and ³H, ¹H and
²H, or ¹²C and ¹³C labels. Control samples are labeled with one isotope while
experimental samples are labeled with another.
Two preferred methods were described above for labeling polypeptides
post-biosynthetically: (a) labeling cysteine during alkylation and reduction of
sulfhydryls and (b) labeling by acetylation of free amino groups. Labeling
through reduction and alkylation of disulfides is obviously the easiest way and
the most preferred for subsequent electrophoretic analysis because it does the
least to disturb the charge.

Radioisotopes. Determining the ratio of radionuclides in 2-D gels
requires a special detection method. The energy of β particles from ³H is
roughly 0.018 MeV whereas the radiation from ¹⁴C is approximately 0.15 MeV.
This difference in energy is the basis for discriminating between these two
radionuclides. Counting ³H requires a very thin mylar window. This fact can be
exploited for differential autoradiographic detection with a commercial imager
(e.g., a CYCLONE Storage Phosphor System, Packard, Meriden, CT). Modern
imagers work by imposing a scintillator screen between the gel and the imager.
Using a ¹⁴C control and an absorption filter to block ³H β radiation allows for
measurement of radiation intensity for the control alone. Removing the filter
and performing the autoradiographic detection again gives an intensity for ³H +
¹⁴C. Using densitometry, it is possible to determine density ratios between
different spots on the same autoradiogram and between autoradiograms. The
limitation of this approach is that it is difficult to recognize i) proteins that only
increase slightly in concentration, ii) up- or down-regulation in a spot that
contains multiple proteins, and iii) proteins that are substantially down-
regulated. Down-regulation will be recognized by switching the isotopes, i.e.,
³H is used as the control label and ¹⁴C as the experimental labeling agent. Once
a protein spot is seen that appears to be up- or down-regulated, much better
quantitation can be achieved by excising the spot and using scintillation
methods for double label counting.
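The arithmetic behind the filtered and unfiltered exposures can be sketched as
follows. This is an illustration only: it assumes the two densitometric
readings for a spot are on the same scale and that detector response to the
two radionuclides has already been calibrated, neither of which is specified
above. The function name is hypothetical.

```python
def tritium_to_carbon_ratio(filtered, unfiltered):
    """Estimate the 3H/14C intensity ratio for one gel spot.

    filtered   -- intensity with the absorption filter in place (14C only)
    unfiltered -- intensity without the filter (3H + 14C)
    """
    if filtered <= 0:
        raise ValueError("14C (control) intensity must be positive")
    tritium = unfiltered - filtered  # contribution of 3H alone
    if tritium < 0:
        raise ValueError("unfiltered intensity is below filtered intensity")
    return tritium / filtered
```

With a ¹⁴C control and a ³H experimental label, a spot reading 200 with the
filter and 500 without it would give a ratio of 1.5 under these assumptions.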

Phosphorylation of proteins with ³²P labeled nucleotides and
glycosylation in mammalian systems with ¹⁴C labeled N-acetylglucosamine are
also envisioned, allowing studies of post-translational modifications that lend
themselves to multi-isotope labeling and detection strategies.
There are several advantages of this radioisotope version of the internal
standard as applied to 2-D gel electrophoresis. One is that it allows a large
number of proteins to be screened for up- or down-regulation from a single
sample, in a single run, with a single gel. A second is that excision of spots is
not required, i.e., the degree of manual manipulation is minimal. Yet another
advantage is that inter-run differences between gels and in the execution of the
method have no impact on the success of the method.

Stable isotopes. Proteins that have been reduced and alkylated with
either ICH2COOH or ICD2COOH and mixed before electrophoresis are used to
produce peptide digests in which a portion of cysteine containing peptides are
deuterium labeled. These peptides appear as doublets separated by 2 amu in the
MALDI spectrum. In those cases where there are several cysteine residues in a
peptide, the number of cysteines determines the difference in mass between the
control and experimental samples. For each cysteine, the difference in mass
increases by 2 amu. ¹³C labeling can also be used. The Δ term is derived from
isotope ratios in several adjacent protein spots on the gel whereas δ is computed
from the ratio in the target spot. Only those peptides that deviate from the
average value of Δ are targets for further analysis. This version of the internal
standard method has most of the advantages of the radioisotope method in terms
of quantification, use of a single sample and gel, and reproducibility. The radio-
and stable-isotope strategies can also be combined and applied to 2D gel
electrophoresis. The advantage of combining them is that only those spots
which appear to have been up- or down-regulated by radioactive analysis are
subjected to MALDI-MS. When stable and radio-labeled peptides are used in
the same experiment, the stable isotopes are a way to identify and fine tune
quantification.
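The two calculations described for the stable isotope version can be sketched
as below. The function names and the 25% deviation tolerance are assumptions
made for illustration; the average of the neighboring-spot ratios stands in
for Δ and the target-spot ratio for δ.

```python
def expected_doublet_spacing(peptide, amu_per_cysteine=2.0):
    """Mass difference (amu) expected between light and heavy forms of a
    cysteine-containing peptide, given as a one-letter amino acid sequence."""
    return peptide.upper().count("C") * amu_per_cysteine


def is_target(target_ratio, neighbor_ratios, tolerance=0.25):
    """Flag a spot whose isotope ratio deviates from the average ratio of
    adjacent spots by more than a relative tolerance (hypothetical value)."""
    big_delta = sum(neighbor_ratios) / len(neighbor_ratios)
    return abs(target_ratio - big_delta) > tolerance * big_delta
```

A peptide with two cysteines would thus be expected as a doublet separated by
4 amu, and a spot with ratio 3.0 among neighbors averaging about 1.0 would be
selected for further analysis.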



Construction of temporal maps 

The discussion above would imply that regulation is a process that can
be understood with single measurements, i.e., after a stimulus has been applied
to a biological system one makes a measurement to identify what has been
regulated. Single measurements at the end of the process only identify the cast
of characters. Regulation involves adjusting, directing, coordinating, and
managing these characters. The issue in regulation is to understand how all
these things occur. Regulation is a temporal process involving a cascade of
events. Consider, for example, the hypothetical case in which an external
stimulus might cause modification of a transcription factor, which then interacts
with another transcription factor, the two of which initiate transcription of one
or more genes, which causes translation, and finally post-translational
modification to synthesize another transcription factor, etc. Temporal analysis
brings a lot to understanding this process. Global analysis of protein synthesis
in response to a variety of stimuli has been intensely examined and at least two
mapping strategies have been developed (R. VanBogelen et al., in F. Neidhardt
et al., Ed. Escherichia coli and Salmonella: Cellular and Molecular Biology,
2nd Ed. ASM Press, Washington D.C., pp. 2067-2117; H. Zhang et al., J.
Mass Spec. 31:1039-1046 (1996)).

A temporal map of protein expression can be constructed by first
identifying all species that change in response to a stimulus, then performing a
detailed analysis of the regulatory process during protein flux. Identification of
those proteins affected by the stimulus is most easily achieved by a single
measurement after the regulatory event is complete and everything that has
changed is in a new state of regulation. Both chromatographic and
electrophoretic methods can be used to contribute to this level of understanding.
The regulatory process during protein flux is then analyzed at short time
intervals and involves many samples. The initial identification process yields
information on which species are in flux, their signature peptides, and the
chromatographic behavior of these peptides. As a result, the researcher
knows which samples contain specific signature peptides and where to find
them in mass spectra. Quantitating the degree to which their concentration has
changed with the internal standard method is straightforward. The resulting
data allow temporal maps of regulation to be constructed, and the temporal
pattern of regulation will provide information about the pathway of response to






the stimulus. The invention thus further provides a method for developing 
algorithms that identify signature peptides in regulatory change. 
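One possible data layout for such a temporal map is sketched below. It assumes
that each time point has already been reduced to a table of δ values keyed by
signature peptide; the function name, peptide labels, and time units are
hypothetical.

```python
def build_temporal_map(samples):
    """Rearrange per-time-point measurements into per-peptide time courses.

    samples maps a sampling time to a dict of {peptide label: delta value}.
    Returns {peptide label: [(time, delta), ...]} ordered by time.
    """
    temporal_map = {}
    for time_point in sorted(samples):
        for peptide, delta in samples[time_point].items():
            temporal_map.setdefault(peptide, []).append((time_point, delta))
    return temporal_map
```

Each resulting time course can then be inspected for the order in which
species begin to change after the stimulus.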

Microfabricated analytical systems 
The method of the invention is amenable to automation by integrating
most of the analytical steps in a single instrument. Alkylation, reduction,
proteolysis, affinity selection, and reversed phase chromatography (RPC) can be
executed within a single multidimensional chromatographic system. Samples
collected from this system are manually transferred to MALDI plates for mass
spectrometric analysis. In one embodiment, the invention provides a single
channel integrated system. In a preferred embodiment, however, the invention
provides a microfabricated, integrated, parallel processing, microfluidic
system that carries out all the separation components of analysis on a single chip.



EXAMPLES

The present invention is illustrated by the following examples. It is to
be understood that the particular examples, materials, amounts, and procedures
are to be interpreted broadly in accordance with the scope and spirit of the
invention as set forth herein.



Example I. 

Signature Peptide Approach To Detecting Proteins in Complex Mixtures 

The objective of the work presented in this example was to test the
concept that tryptic peptides may be used as analytical surrogates of the protein
from which they were derived. See Geng et al., Journal of Chromatography A,
870 (2000) 295-313; Ji et al., Journal of Chromatography B, 745 (2000)
197-210. Proteins in complex mixtures were digested with trypsin and classes of
peptide fragments selected by affinity chromatography (in this case, lectin
columns were used). Affinity selected peptide mixtures were directly
transferred to a high-resolution reversed-phase chromatography column and
further resolved into fractions that were collected and subjected to
matrix-assisted laser desorption ionization (MALDI) mass spectrometry. The presence
of specific proteins was determined by identification of signature peptides in the
mass spectra.

Advantages of this approach are that (i) it is easier to separate peptides
than proteins, (ii) native structure of the protein does not have to be maintained
during the analysis, (iii) structural variants do not interfere and (iv) putative
proteins suggested from DNA databases can be recognized by using a signature
peptide probe.



Materials and Methods 

Materials. Human serotransferrin, human serum,
N-tosyl-L-phenylalanine chloromethyl ketone (TPCK)-treated trypsin, concanavalin A
(Con A), Bandeiraea simplicifolia (BS-II) lectin,
tris(hydroxymethyl)aminomethane (Tris base), iodoacetic acid,
tris(hydroxymethyl)aminomethane hydrochloride (Tris acid), cysteine,
dithiothreitol (DTT), N-tosyl-L-lysyl chloromethyl ketone (TLCK), and
N-acetyl-D-glucosamine were purchased from Sigma (St. Louis, MO, USA).
Nuclear extract from calf thymus was provided by Professor M. Bina
(Department of Chemistry, Purdue University, W. Lafayette, IN, USA).
LiChrospher Si 1000 (10 μm, 1000 Å) was obtained from Merck (Darmstadt,
Germany). 3,5-Dimethoxy-4-hydroxy-cinnamic acid (sinapinic acid),
3-aminopropyltriethoxysilane, polyacrylic acid (PAA), dicyclohexyl
carbodiimide (DCC), and d3-C1 acetic anhydride were purchased from Aldrich
(Milwaukee, WI, USA). Methyl-α-D-mannopyranoside was obtained from
Calbiochem (La Jolla, CA, USA). Toluene, 1,4-dioxane and dimethylsulfoxide
(DMSO) were purchased from Fisher Scientific (Fair Lawn, NJ, USA).
N-Hydroxysuccinimide (NHS) and high-performance liquid chromatography
(HPLC)-grade trifluoroacetic acid (TFA) were purchased from Pierce
(Rockford, IL, USA). HPLC-grade water and acetonitrile (ACN) were
purchased from EM Science (Gibbstown, NJ, USA). All reagents were used directly
without further purification.

Synthesis of lectin column. A 1-g quantity of LiChrospher Si 1000 was activated
for 5 hours at room temperature by addition of 40 ml 6 M HCl. The silica
particles were then filtered and washed to neutrality with deionized water after
which they were dried initially for 2 hours at 105°C and then at 215°C
overnight. Silica particles thus treated were reacted with 0.5%
3-aminopropyltriethoxysilane in 10 ml toluene for 24 hours at 105°C to produce
3-aminopropylsilane derivatized silica (APS silica). Polyacrylic acid (0.503 g;
Mr 450 000), N-hydroxysuccinimide (1.672 g), and dicyclohexyl carbodiimide
(6.0 g) were dissolved into 40 ml DMSO and shaken for 3 hours at room
temperature to activate the polymer. The reaction mixture was filtered and the
activated polymer harvested in the supernatant. Acrylate polymer was grafted
to the silica particles by adding the APS silica described above to the activated
acrylate polymer containing supernatant. Following a 12-hour reaction at room
temperature, the particles were filtered and washed sequentially with 50 ml
DMSO, 50 ml dioxane and 50 ml deionized water. This procedure produces a
polyacrylate coated silica with residual N-acyloxysuccinimide activated groups,
specified as NAS-PAA silica. NAS-PAA silica (0.5 g) was added to 10 ml of
0.1 M NaHCO3 (pH 7.5) containing 0.2 M methyl-α-D-mannopyranoside and
200 mg Con A. The reaction was allowed to proceed with shaking for 12 hours at
room temperature after which immobilized Con A sorbent was isolated by
centrifugation and was washed with 0.1 M Tris buffer (pH 7.5). The sorbent
was stored in 0.1 M Tris buffer (pH 7.5) with 0.2 M NaCl until use.

NAS-PAA silica (0.3 g) was added to 10 ml of 0.1 M NaHCO3 buffer
(pH 7.5) containing 0.2 M N-acetyl-D-glucosamine and 20 mg BS-II lectin. The
reaction was allowed to proceed with shaking for 12 hours at room temperature
after which the immobilized lectin containing particles were isolated by
centrifugation, washed with 0.1 M Tris buffer (pH 7.5), and packed into a
stainless steel column (50 × 4.6 mm) using the wash buffer and a high-pressure
pump from Shandon Southern Instruments (Sewickley, PA, USA). Affinity
columns were washed with 0.1 M Tris (pH 7.5) with 0.2 M NaCl before use.

Proteolysis. Human serotransferrin (5 mg), nuclear extract from bovine
cells, or human serum were reduced and alkylated in the same way by adding to
1 ml 0.2 M Tris buffer (pH 8.5) containing 8 M urea and 10 mM DTT. After a
2-hour incubation at 37°C, iodoacetic acid was added to a final concentration of 20
mM and incubated in darkness on ice for a further 2 hours. Cysteine was then
added to the reaction mixture to a final concentration of 40 mM and the reaction
allowed to proceed at room temperature for 30 min. After dilution with 0.2 M
Tris buffer to a final urea concentration of 3 M, TPCK-treated trypsin (2%, w/w,
of enzyme to that of the protein) was added and incubated for 24 hours at 37°C.
Digestion was stopped by adding TLCK in a slight molar excess over that of
trypsin.
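As a worked check on the quantities in this step, the dilution and enzyme
amounts can be computed as in the sketch below. It assumes the reaction volume
is still about 1 ml when the buffer is added (the small volumes of iodoacetic
acid and cysteine are ignored), and the function names are hypothetical.

```python
def diluent_volume_ml(initial_volume_ml, initial_molar, final_molar):
    """Volume of urea-free buffer to add so that the urea concentration
    falls from initial_molar to final_molar (C1 * V1 = C2 * V2)."""
    final_volume_ml = initial_volume_ml * initial_molar / final_molar
    return final_volume_ml - initial_volume_ml


def trypsin_mass_mg(protein_mg, percent_w_w=2.0):
    """Mass of trypsin for a given enzyme-to-protein ratio in percent (w/w)."""
    return protein_mg * percent_w_w / 100.0
```

Under these assumptions, bringing 1 ml of 8 M urea to 3 M requires roughly
1.67 ml of buffer, and 5 mg of serotransferrin calls for 0.1 mg of trypsin.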

Chromatography. All chromatographic steps were performed using an
Integral microanalytical workstation from PE Biosystems (Framingham, MA,
USA). Tryptic digested human serotransferrin (0.1 ml) was injected onto the
Con A affinity column that had been equilibrated with a loading buffer
containing 1 mM CaCl2, 1 mM MgCl2, 0.2 M NaCl and 0.1 M Tris-HCl (pH 7.5).
The Con A column was eluted sequentially at 1 ml/min with two column
volumes of loading buffer and then 0.2 M methyl-α-D-mannopyranoside in 0.1
M Tris (pH 6.0). Analytes displaced from the affinity column with 0.2 M
methyl-α-D-mannopyranoside were directed to a 250 × 4.6 mm Peptide C18 (PE
Biosystems) analytical reversed-phase HPLC column, which had been
equilibrated for 5 minutes at 1.0 ml/min with 5% ACN containing 0.1%
aqueous TFA. The glycopeptides were then eluted at 1.0 ml/min in a 35-min
linear gradient to 50% ACN in 0.1% aqueous TFA. Eluted peptides were
monitored at 220 nm and fractions manually collected for matrix-assisted laser
desorption ionization time-of-flight (MALDI-TOF) analysis.

Tryptic digested human serum (0.2 ml) was injected on the Con A and
reversed-phase HPLC column using conditions similar to those used with
human serotransferrin with the following exceptions. The reversed-phase
column was washed for 10 minutes at 1 ml/min with 10% ACN containing
0.1% aqueous TFA and the glycopeptides were eluted at 1 ml/min with a
120-min linear gradient to 70% ACN containing 0.1% aqueous TFA.

Nuclear extract (0.1 ml) was injected onto the BS-II column which had
been equilibrated with loading buffer, 0.2 M NaCl with 0.1 M Tris (pH 7.5).
After sample loading the BS-II column was washed with 20 column volumes of
loading buffer and then eluted with 0.2 M N-acetyl-D-glucosamine in the
loading buffer. Glycopeptides and glycoproteins eluted from the BS-II column
were transferred to a reversed-phase column, which had been equilibrated for 5
minutes at 1 ml/min with 5% ACN containing 0.1% aqueous TFA. The
glycoproteins were then eluted at 1 ml/min with a 25-min linear gradient to 35%
ACN containing 0.1% aqueous TFA. The glycopeptides were eluted at 1
ml/min with a 35-min linear gradient to 50% ACN containing 0.1% aqueous
TFA.

Synthesis of d3-C2 N-acetoxysuccinimide. A solution of 4.0 g (34.8
mmol) of N-hydroxysuccinimide in 10.7 g (105 mmol) of d3-C1 acetic anhydride
was stirred at room temperature. After 10 minutes, white crystals began to
deposit. The liquid phase was allowed to evaporate and the crystalline residue
was extracted with hexane and allowed to dry under vacuum. The yield of the
substance was 5.43 g (100%), m.p. 133-134°C.

Acetylation reaction with the peptides. A 3-fold molar excess of
N-acetoxysuccinimide and d3-C1 N-acetoxysuccinimide was added individually to
two equal aliquots of 1 mg/ml peptide solution in phosphate buffer at pH
7.5, respectively. The reaction was carried out at room temperature. After stirring
for about 4-5 hours, equal aliquots of the two samples were mixed and purified
on a C18 column. The collected fractions were then subjected to MALDI-MS.

MALDI-TOF-MS. MALDI-TOF-MS was performed using a Voyager
DE-RP BioSpectrometry workstation (PE Biosystems). Samples were prepared
by mixing a 1-μl aliquot with 1 μl of matrix solution. The matrix solution for
glycopeptides was prepared by saturating a water-ACN (50:50, v/v), 3% TFA
solution with sinapinic acid. A 1-μl sample volume was spotted into a well of
the MALDI sample plate and allowed to air-dry before being placed in the mass
spectrometer. All peptides were analyzed in the linear, positive ion mode by
delayed extraction using an accelerating voltage of 20 kV unless otherwise
noted. External calibration was achieved using a standard "calibration 2"
mixture from PE Biosystems.

The matrix for acetylated peptides was a solution of 3% TFA in
ACN-water (50:50) saturated with α-cyano-4-hydroxycinnamic acid.
Peptide quantitation was performed on MALDI-TOF-MS in the reflector mode
as described above. Ten spectra were collected from each sample spot and the
peak intensities averaged for each spot. A linear equation was deduced from the
ion current intensity ratio of the deuterium-labeled and the unlabeled acetylated
peptides versus the ratio of the amount of these two peptides.
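A linear relation of this kind can be obtained by ordinary least squares, as
in the following sketch. The fitting procedure actually used is not stated
above, so the choice of an unweighted fit, the function name, and the example
numbers are assumptions for illustration only.

```python
def fit_line(amount_ratios, intensity_ratios):
    """Unweighted least-squares fit of intensity ratio against amount ratio.

    Returns (slope, intercept) of intensity_ratio = slope * amount_ratio + intercept.
    """
    n = len(amount_ratios)
    if n < 2 or n != len(intensity_ratios):
        raise ValueError("need two or more paired observations")
    mean_x = sum(amount_ratios) / n
    mean_y = sum(intensity_ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in amount_ratios)
    if sxx == 0:
        raise ValueError("amount ratios must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amount_ratios, intensity_ratios))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Once the slope and intercept are known, an unknown amount ratio can be
estimated from a measured intensity ratio by inverting the fitted line.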

The effect of buffer type and concentration on mass determination by
MALDI-time-of-flight mass spectrometry is discussed in Amini et al., Journal
of Chromatography A, 894 (2000) 345-355.

Results and Discussion 

Analytical strategy. The work reported here is based on the proposition
that signature peptides generated by tryptic digestion of sample proteins may be
selected from complex mixtures and be used as analytical surrogates for the
protein from which they were derived. The rationale for this approach is that (i)
it will be easier to separate and identify signature peptides than intact proteins in
many cases, (ii) the requisite isolation of proteins for reagent preparation and
identification can be precluded by synthesizing signature peptides identified in
protein and DNA databases, and (iii) it is easier to tryptic digest all proteins in a
single reaction than to isolate and digest each individually as in the 2D
electrophoretic approach.

A five-step protocol was used for production of signature peptides. The
first step was to select a sample from a particular compartment or organelle.
Simple methods, such as centrifugal fractionation of organelles, greatly enrich a
sample in the components being examined. The second step embodied
reduction and alkylation of all proteins in the sample. In some cases the
alkylating agent can be affinity labeled to facilitate subsequent selection of
cysteine-containing peptides. The third step was tryptic digestion of all
polypeptides in the reduced and alkylated sample. A few to more than a
hundred peptides will be generated from each protein, depending on solubility
and ease of digestion. Although data are not presented, it was found that trypsin
will partially digest leather and by so doing generates signature peptides. This
potentially offers an avenue to the analysis of insoluble proteins. The enormous
complexity of the sample produced by proteolysis was reduced in a third step by
using affinity chromatography methods to select peptides with unique structural
features. Affinity selected peptides were then fractionated by high-resolution
RPLC in a fourth step. And finally, target peptides from RPLC fractions were
identified by MALDI-TOF-MS mass in the fifth step.

The analytical strategy employed in this study focused on the ability of
Con A lectin columns to select glycopeptides from tryptic digests, RPLC to
further fractionate the selected peptides, and MALDI-TOF-MS to identify
specific peptides in RPLC fractions. Lectin columns have been widely used to
purify glycopeptides, generally for the purpose of studying the oligosaccharide
portion of the conjugate. When characterization of the sugar moiety is the
object, it is important to fractionate as many of the glycoforms as possible,
either with serial lectin columns, anion-exchange chromatography, or capillary
electrophoresis. The focus of this work, in contrast, was on the peptide portion
of the glycoconjugate. Any glycoform containing the signature peptide
backbone is appropriate for protein identification. Con A has high affinity for
N-type hybrid and high-mannose oligosaccharides, slightly lower affinity for
complex di-antennary oligosaccharides, and virtually no affinity for complex
N-type tri- and tetra-antennary oligosaccharides. Most of the N-type glycoproteins
contain glycoforms that are recognized by Con A. Thus, a Con A column is
ideal for selecting glycopeptides from digests of N-type glycoproteins.

Compartmentalization. Protein(s) of interest often reside in a particular
compartment in a cell or organism. The act of first isolating the compartment
within which the protein is contained can produce a very substantial
simplification of the sample. One system chosen for this study was
glycoproteins in bovine cellular nuclei.

Glycoproteins in the nuclei of mammalian cells are distinctly different from
those found in the cytosol. Higher animal cells reversibly O-glycosylate some
nuclear proteins with a single N-acetyl glucosamine (O-GlcNAc) at a specific
serine or threonine residue. It is thought that this O-GlcNAc glycosylation is
associated with transcription factors and is part of a control process; thus it is
necessary to have enzymes for both glycosylation and deglycosylation in the same
compartment. It was an objective in this study to gain a rough idea of the
number of these glycoproteins in the nuclei of bovine pancreas cells.

Subsequent to the isolation of nuclei by centrifugation, histones were
selectively removed and O-glycosylated proteins isolated as a group by
chromatography on a Bandeiraea simplicifolia (BS-II) lectin affinity column.
This lectin is specific for N-acetyl glucosamine. A silica based BS-II column
was synthesized and coupled with a switching valve to a reversed-phase
column. This two-dimensional chromatographic system was used to
concentrate and purify glycoproteins from nuclei. Reversed-phase
chromatography (Fig. 2) and 2D gel electrophoresis of the protein fraction
eluted from the lectin column by N-acetyl-D-glucosamine (0.20 M) confirm the
presence of some 25-35 major components in the sample. More components
may be present but below the limits of detection. Considering that some 20,000
proteins may be expressed in mammalian cells, this is much simpler than
anticipated. The results of this study show that compartmentalization and
affinity selection of specific proteins from a cell can greatly reduce the number
of proteins in a sample.

When the protein sample used for glycoprotein analysis was reduced,
alkylated with iodoacetamide, and trypsin digested before chromatography on
the BS-II lectin affinity column, the reversed-phase chromatogram of the
glycopeptides captured by the affinity column again shows unexpected
simplicity (Fig. 3). Mass spectra of selected peaks (Fig. 4) indicate a relatively
low degree of complexity in fractions collected from the reversed-phase column.
No attempt was made to identify these peptides by either database searches or
multidimensional MS.

Signature peptide selection from serotransferrin. Serotransferrin, i.e.,
transferrin from serum, was chosen as a model protein to examine affinity
selection of signature peptides. Human serotransferrin is a glycoprotein of Mr
80,000 containing 679 amino acid residues. Potential sites for N-glycosylation
are found in the sequence at residues Asn413 and Asn611. The reversed-phase
chromatogram of a tryptic digest (Fig. 5a) is seen to be substantially reduced in
complexity when non-glycosylated peptides are first removed with a
concanavalin A affinity chromatography column (Fig. 5b). The peptides
glycosylated at residues Asn413 and Asn611 eluted at 27.5 and 33.4% of solvent
B, respectively. MALDI-MS spectra of the two major components from Fig. 5b are
seen in Figs. 6a and 6b, respectively. Although the chromatographic peaks
appear to be homogeneous, MALDI-TOF-MS indicates considerable
heterogeneity within the two fractions. This is as expected. It is known that
there is often substantial heterogeneity in the oligosaccharide portion of a
glycopeptide. The stationary phase of the reversed-phase column interacts
almost exclusively with the peptide region of glycopeptides, essentially ignoring
the oligosaccharide portion. This means that glycopeptides which are
polymorphic in the oligosaccharide part of the molecule will produce a single
chromatographic peak, albeit slightly broader than that of a single species. On
the other hand, MALDI-TOF-MS discriminates on the basis of mass and detects
all species that differ in mass without regard to structure. Used together, these
two methods produce a high degree of structural selectivity.

Identification of serotransferrin signature peptides from serum. Based
on the solvent composition known to elute the serotransferrin glycopeptides and
their mass spectra, an experiment was undertaken to identify these signature
peptides in a tryptic digest of human serum proteins. Chromatograms in
Figs. 7a and 7b show the enormous complexity of the glycopeptide mixture
selected from a tryptic digest of human serum by a Con A affinity
chromatography column. Fractions eluting between 27 and 28% and between
33 and 34% were collected from the reversed-phase column and their mass
spectra compared with that of human serotransferrin. Although extremely
complex, mass spectra (Figs. 6a and 6b) obtained from fractions corresponding
in chromatographic properties to the serotransferrin glycopeptides reveal the
presence of these signature peptides in the serum sample. Fig. 8a shows masses
at 3861, 4163 and 4213 u, matching the glycopeptide peaks from Fig. 6a. Mass
error was typically <4 u using external calibration. Because of the relatively
lower amount of human transferrin in an individual's serum, higher laser
power was used to generate the spectra than was used for pure human transferrin.
Therefore, peak intensities and spectral resolution were lower. In
order to increase the signal-to-noise ratio, all the spectra were smoothed by a
19-point averaging process. This caused the mass error to be a little higher.
Glycoforms at 3459, 3614 and 3895 u were either absent or ion suppressed
sufficiently so that they could not be seen. The fractions from
25 to 27% and from 29 to 31% were also checked; no more than one peak matched
the glycopeptide peaks from Fig. 6a. This demonstrated that the matching of these
peaks was not coincidental. Fig. 8b shows that peaks at 4595, 4634, 4710 and 4753 u
matched the glycopeptide peaks from Fig. 6b. Again, fractions from 31 to 33%
and 34 to 36% were checked and no matching was found. The fact that the
spectra are not identical in relative intensities to the standards can be explained
by two possible reasons: differences in glycosylation ratio between the reference
protein and that in the serum sample of an individual, and inter-run variations in
MALDI spectra resulting from differences in MALDI ionization.

Although not examined, other modes of selection are also possible. A
variety of lectins are available that allow the selection of specific
types of post-translational modification on the basis of oligosaccharide
structure. Antibodies would be another way to select for specific types of
post-translational modification such as phosphorylation. Antibodies have also
been used to select dinitrophenyl derivatized amino acids, such as tryptophan.
Alkylation of cysteine with a biotinylated form of maleimide has been
suggested as another way to select cysteine-containing peptides with avidin.
Perhaps double selection by a combination of these affinity methods will give
even higher degrees of selectivity.

It is concluded that signature peptides derived from tryptic digests of
complex protein mixtures can be used as analytical surrogates, at least in the
case of glycoproteins. Even in the case of samples with the complexity of
human serum, the multidimensional analytical approach of affinity
chromatography, reversed-phase chromatography and mass spectrometry has
sufficient resolution to identify single signature peptide species. Because the
whole protein is not needed for analysis, this strategy is particularly suited to the
identification of proteins of limited solubility or that are suggested from DNA
databases but have never been isolated.

Example II. Sample Protocol for Analysis of Protein Mixtures

The following protocol is one of many according to the invention that
are useful for analyzing complex protein mixtures.

Step 1. Reduction of entire sample containing several thousand proteins
in a robotic sample handling system.

Step 2. Alkylate sulfhydryl groups. If cysteine selection is desired the
alkylating reagent is an affinity tagged maleimide. If the selection will be for
another amino acid, the alkylating agent will be iodoacetic acid or
iodoacetamide.

Step 2'. If another amino acid is to be affinity selected, such as tyrosine,
that derivatizing agent is added at this time.

Step 3. Proteolysis; generally with trypsin, but any proteolytic enzyme
or combination of enzymes could be used. Enzymatic digestion could be done
either in the robotic system or with an immobilized enzyme column.

Step 4. An affinity sorbent is used to adsorb affinity tagged species.
Non-tagged peptide species are eluted to waste.

Step 5. Tagged species are desorbed from the affinity sorbent.

Step 6. Tagged species are chromatographically resolved. In the
simplest case the sample is subjected to high resolution reversed-phase
chromatography (RPC) only. Still higher resolution can be achieved by using
two dimensional chromatography. Step gradient elution ion exchange
chromatography with RPC of each fraction is a good choice. If the ion
exchange column splits the tagged species into 50 fractions and the RPC
column has a peak capacity of 200, it is possible to generate 10,000 fractions for
MALDI. It is estimated that the total number of sulfhydryl containing peptides
would not exceed 20,000. This would mean that no sample would contain more
than 2-10 peptides. MALDI should be very capable of handling 1-30 peptides
per sample.

Step 7. Samples are collected from the chromatographic system and
transferred directly to the MALDI plates. Alternatively, if the sample is not too
complex, analytes are electrosprayed directly into an ESI-MS.
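
The fraction estimate in Step 6 can be checked with a short calculation. The following Python sketch is illustrative only; the function name and its parameters are hypothetical and simply restate the figures given in the protocol (50 ion exchange fractions, an RPC peak capacity of 200, and at most 20,000 sulfhydryl containing peptides). It assumes the peptides are spread evenly over the fractions, which a real separation would only approximate.

```python
def fraction_budget(iex_fractions: int, rpc_peak_capacity: int, total_peptides: int):
    """Return (total fractions, mean peptides per fraction) for a 2-D separation.

    Assumes every ion exchange fraction is run on RPC and that peptides are
    distributed evenly, which is an idealization.
    """
    total_fractions = iex_fractions * rpc_peak_capacity
    return total_fractions, total_peptides / total_fractions


fractions, mean_load = fraction_budget(50, 200, 20_000)
print(fractions, mean_load)
```

An average load of about two peptides per fraction is the lower end of the 2-10 range quoted in Step 6 and sits well inside the 1-30 peptides per sample that MALDI is said to handle.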

Example III. Representative Amino Acid Derivatizations

1. Tryptophan can be derivatized with 2,4-dinitrophenylsulfenyl
chloride [Biochim. Biophys. Acta 278, 1 (1972)]. Reaction conditions: 50%
acetic acid, 1 hour, room temperature. Selection is based on
dinitrophenyl-directed antibodies.

2. Cysteine can be derivatized with an affinity tagged maleimide. Normal
and deuterium labeled tags are mixed so that tagged species are easily identified
in the MALDI spectrum as a doublet that is three mass units apart.

For example, a cysteine residue in a polypeptide can be derivatized with
affinity tagged D2-maleimide. Here, the affinity tag is a peptide.

[Structure drawing: a polypeptide chain with side chains R1, R2, R3 and R4
whose cysteine side chain (CH2) is linked through the D2-maleimide to a
peptide affinity tag.]

3. Cysteine can alternatively be derivatized with 2,4-dinitrobenzyl
chloride. Conditions: pH 5, 1 hour, room temperature.

4. Methionine can be derivatized under acidic conditions. This
derivatizing agent also derivatizes histidine at pH 5. The substantial ionization
of histidine at pH 3 apparently diminishes its alkylation. In view of the fact that
histidine reacts with this reagent, it is preferable to remove histidine peptides
with IMAC before derivatization.

[Reaction scheme: the methionine side chain (CH2CH2-S-CH3) of a peptide
(-NH-CH-CO-NH-) is alkylated with a nitro-substituted aryl reagent bearing an
-NHCH2CH2NH-CO-CH2-Br group at pH 3 for 24 hr at R.T. in 8 M urea, giving the
positively charged S-alkylated product.]



Example IV.
Advantages and Disadvantages of Selective Capture of Specific Amino Acids

1. Cysteine.

   a. Biotinylation of maleimide.
      Positives: very high affinity capture. Avidin columns are readily
      available.
      Negatives: it takes very acidic conditions to release from columns.
      A large molecule (avidin) is being used to capture a small molecule,
      thus a large column is needed to capture enough peptide for analysis.

   b. Histidine labeling of maleimide.
      Positives: very simple columns may be used that are of high capacity.
      Negatives: non-cysteine containing peptides in the digest that also
      contain histidine will also be selected. In addition, the mass starts
      to get a little high.

   c. Peptide labeling and antibody (Ab) capture.
      Positives: very high capture efficiency. Easy to release captured
      peptide.
      Negatives: a large molecule (Ab) is being used to capture a small
      molecule, thus a large and expensive column is needed to capture
      enough peptide for analysis.

   d. Dinitrophenylation.
      Positives: very simple organic chemistry. Antibody capture is very
      efficient.
      Negatives: a large molecule (Ab) is being used to capture a small
      molecule, thus a large and expensive column is needed to capture
      enough peptide for analysis. It is also difficult to heavy isotope
      label 2,4-DNP.

2. Tryptophan.

   a. Dinitrophenylation.
      Positives: very simple organic chemistry. Antibody capture is very
      efficient.
      Negatives: a large molecule (Ab) is being used to capture a small
      molecule, thus a large and expensive column is needed to capture
      enough peptide for analysis. It is also difficult to heavy isotope
      label 2,4-DNP.

3. Methionine.

   a. Dinitrophenylation.
      Positives: very simple organic chemistry. Antibody capture is very
      efficient.
      Negatives: a large molecule (Ab) is being used to capture a small
      molecule, thus a large and expensive column will be needed to capture
      enough peptide for analysis. It is also difficult to heavy isotope
      label 2,4-DNP.

   b. Histidine labeling.
      Positives: very simple columns may be used that are of high capacity.
      Negatives: non-cysteine containing peptides in the digest that also
      contain histidine will also be selected. In addition, the mass starts
      to get a little high.

   c. Peptide labeling and antibody capture.
      Positives: very high capture efficiency. Easy to release captured
      peptide.
      Negatives: a large molecule (Ab) is being used to capture a small
      molecule, thus a large and expensive column is needed to capture
      enough peptide for analysis.

   d. Biotinylation.
      Positives: very high affinity capture. Avidin columns are readily
      available.
      Negatives: it takes very acidic conditions to release from columns.
      A large molecule (avidin) is being used to capture a small molecule,
      thus a large column is needed to obtain enough peptide for analysis.

4. Tyrosine.

   a. Nitrophenylation and antibody capture.
      Positives: very simple organic chemistry. Antibody capture is very
      efficient.
      Negatives: a large molecule (Ab) is being used to capture a small
      molecule, thus a large and expensive column is needed to capture
      enough peptide for analysis. It is also difficult to heavy isotope
      label NP.

   b. Reaction with diazonium salts to form a wide variety of derivatives.
      Positives: simple reaction that is well known.
      Negatives: very hydrophobic group, affinity tag must be attached,
      cross reacts with other amino acids.

5. Histidine.

   a. Capture with an IMAC column.
Example V. Sample Post-Digestion Secondary Labeling Protocol

[Flow diagram. Main path: Protein -> reduced protein -> alkylated protein* ->
tryptic peptides. Upper branch: select for glycosylation, phosphopeptides or
histidine first (endogenous affinity ligand), followed either by RPC directly
or by selection for cysteine second (exogenous affinity ligand) and then RPC.
Lower branch: secondary affinity labeling (tryptophan, methionine or
tyrosine), with two routes that each end in RPC: affinity select for cysteine
first (exogenous affinity ligand) and then for tryptophan, methionine or
tyrosine second; or select for tryptophan, methionine, or tyrosine first
(exogenous affinity ligands).]

* Affinity labeling cysteine residues is optional. It should be noted, however, that
cysteine must be alkylated at this point and if it is not affinity labeled during reduction,
it can never be labeled.
Example VI. Sample Pre-Digestion Labeling Protocol

[Flow diagram. Main path: Protein -> reduced, alkylated* -> secondary affinity
labeling -> proteolysis. Upper branch: a selection step labeled
"glycosylation, histidine or cysteine", followed either by RPC directly or by
selection for cysteine, tryptophan, methionine, or tyrosine second and then
RPC. Lower branch, with two routes that each end in RPC: select for cysteine
first and then affinity select for tryptophan, methionine or tyrosine second;
or select for tryptophan, methionine, or tyrosine.]

* Affinity labeling cysteine residues in this case is optional. It should be noted,
however, that cysteine must be alkylated at this point and if it is not affinity
labeled during reduction it can never be labeled.
Example VII.
Isotopically Labeled Internal Standard Quantification

One of the issues with the signature peptide approach is how to
quantitate the protein being identified. Because tryptic digests of samples
containing many proteins are enormously complex, the mixture generally will
not be resolved into individual components by reversed-phase chromatography.
Simple absorbance monitoring is precluded. This will even be true with affinity
selected samples as was seen in Figs. 3 and 7. Figs. 7a and 7b show that there
can be so many components in reversed-phase chromatograms of affinity
selected samples that quantification of any particular peptide is impossible. The
next avenue to quantification would be to use peak height in the MALDI-TOF
spectrum. Unfortunately, MALDI-TOF is not very quantitative. A better
method is needed.

Internal standards are frequently used in quantitation. The internal
standard method of quantification is based on the concept that the concentration
of an analyte in a complex mixture of substances may be determined by adding
a known amount of a very similar, but distinguishable substance to the solution
and determining the concentration of analyte relative to a known concentration
of the internal standard. Assuming that the relative molar response of the
detection system for these two substances ($R_{is}/R_a$) can be determined, then
$\Delta = (A_a/A_{is})\,(R_{is}/R_a)$. The term $A_a$ is the instrument response
to analyte, $A_{is}$ is the instrument response to the internal standard, $R_a$
is the specific molar response to analyte, $R_{is}$ is the specific molar
response to the internal standard, and $\Delta$ is the relative
concentration of analyte to that of the internal standard. It is important that
these substances are as similar as possible in chemical properties so they will
behave the same way in all the steps of the analysis. In view of the fact that the
last step of the analytical protocol used to identify signature peptides is MS,
isotopic labeling of either the internal standard or the analyte would be the best
way to produce an internal standard. Chromatographic systems are generally
not able to resolve isotopic forms of an analyte whereas isotopically labeled
species are easily resolved by MS. Behavioral equivalency in all stages except
MS is critical. The question is how to easily create isotopically labeled internal
standards of peptides in mixtures.
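
The internal standard relation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the disclosed method: the function and parameter names are hypothetical, and the relation used is simply that the concentration of analyte relative to the internal standard equals the ratio of the two instrument responses multiplied by the ratio of the specific molar responses (internal standard over analyte). For isotopic variants of the same peptide the two specific molar responses are assumed to be equal, so the defaults are 1.0.

```python
def relative_concentration(response_analyte: float,
                           response_standard: float,
                           molar_response_analyte: float = 1.0,
                           molar_response_standard: float = 1.0) -> float:
    """Concentration of analyte relative to the internal standard.

    Each instrument response is taken to be (specific molar response) x
    (concentration), so the concentration ratio is the response ratio
    corrected by the ratio of specific molar responses.
    """
    if response_standard <= 0 or molar_response_analyte <= 0:
        raise ValueError("responses must be positive")
    return (response_analyte / response_standard) * (
        molar_response_standard / molar_response_analyte)


# Equal molar responses (isotopic variants): the peak ratio is the answer.
print(relative_concentration(200.0, 100.0))
# An analyte that responds twice as strongly per mole as the standard.
print(relative_concentration(200.0, 100.0, molar_response_analyte=2.0))
```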

This may be done in two ways. One is through the synthesis of peptides
in which one of the amino acids is labeled. The second is by derivatizing
peptides with an isotopically labeled reagent. Although it is more lengthy, the
second route was chosen because it can also be used to create internal standards
of unknown structures. This is critical in proteomic studies where the object is
to identify unknown proteins in regulatory flux.

Data are presented that suggest proteins may indeed be quantified as
their signature peptides by using isotopically labeled internal standards.
Signature peptides generated by trypsin digestion have a primary amino group
at their amino-terminus in all cases except those in which the peptide
originated from the blocked amino-terminus of a protein. The specificity of
trypsin cleavage dictates that the C-terminus of signature peptides will have
either a lysine or arginine (except the C-terminal peptide from the protein) and
that in rare cases there may also be a lysine or arginine adjacent to the
C-terminus. Primary amino groups of peptides were acetylated with
N-acetoxysuccinimide (NAS), the acetate ester of N-hydroxysuccinimide.

When analyzed by MALDI-MS in the positive ion mode, it is seen (Fig.
9) that a peptide with five amino groups (KNNQKSEPLIGRKKT; SEQ ID
NO:1) can be quantitatively derivatized with this reaction. Internal standard
peptides are acetylated with the trideuteroacetylated analogue of
N-hydroxysuccinimide. This means that peptides in samples containing both the
native and deuterated internal standard species (FLSYK; SEQ ID NO:2) would
appear in the mass spectrum as a doublet (Fig. 10a). The presence of a carboxyl
group in all tryptic peptides allows them to be analyzed by MALDI-TOF-MS in
the negative ion mode. It was found that the ε-amino group of all lysines can be
derivatized in addition to the amino-terminus of the peptide, as expected.
Arginine residues are not acetylated. This means that 3 amu would be added for
each lysine when using the trideuteroacetylated reagent. The number of
lysines in a peptide is revealed by the mass shift. (Multiple basic amino acids
occasionally occur at the C-terminus with trypsin.) It is also possible to
differentiate between peptides in which the only basic amino acid is lysine, or
arginine, or a combination of the two. Peptides in which the only basic amino
acid is lysine have no positive charge after acetylation. No spectra will be
produced in the positive ion mode of ion acceleration unless a cationizing agent
is added to the peptide. Actually, the peptide in this case picks up sodium and
potassium ions from the matrix in the MALDI source, causing an increase in
mass equivalent to that of sodium or potassium. Because the masses of these two
ions are different, they appear in the spectrum as a doublet. When coupled with
the fact that the lysine peptide described above in Fig. 10a is also deuterated, the
mass spectrum of this peptide in the positive ion mode of acceleration will show
four peaks (Fig. 10b).
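
The labeling rule in the preceding paragraph (the amino-terminus and every lysine ε-amino group are acetylated, arginine is not, and each trideuteroacetyl group is 3 amu heavier than an acetyl group) fixes the spacing between the native and deuterated forms of a peptide. The Python sketch below illustrates that rule only; the function names are hypothetical, nominal masses are used, and a free (unblocked) amino-terminus is assumed.

```python
LABEL_SHIFT_AMU = 3  # nominal difference between CD3CO- and CH3CO-


def acetylation_sites(peptide: str) -> int:
    """One site for the free amino-terminus plus one per lysine (K).

    Arginine is not counted because it is not acetylated. A blocked
    amino-terminus would need separate handling and is not modeled here.
    """
    return 1 + peptide.upper().count("K")


def doublet_spacing(peptide: str) -> int:
    """Nominal mass difference between light and heavy labeled forms."""
    return LABEL_SHIFT_AMU * acetylation_sites(peptide)


for seq in ("TAGFLR", "FLSYK", "KNNQKSEPLIGRKKT"):
    print(seq, acetylation_sites(seq), doublet_spacing(seq))
```

For KNNQKSEPLIGRKKT the site count comes out at five, in agreement with the "five amino groups" stated for that peptide.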

The mass spectrum for any peptide in a sample containing an
isotopically labeled internal standard will appear as at least a doublet. The
simplest case would be the one where (i) trideutero-NAS was used as the
labeling agent, (ii) the C-terminus was arginine, and (iii) there were no other
basic amino acids in the peptide. Spectra in this case show a doublet in which
the two peaks are separated by 3 u (Fig. 11b). With one lysine the doublet
peaks were separated by 6 u (Fig. 11a) and with two lysines by 9 u. For each
lysine that is added the difference in mass between the experimental and control
would increase an additional 3 u. Quantification of the relative amounts of both
lysine and arginine containing peptides using MALDI-TOF and isotopically
labeled internal standards was studied. A linear equation was deduced from the
ion current intensity ratio of deuterium-labeled and unlabeled acetylated
peptides versus the known ratio of the amount of these two peptides. The
equation of the arginine-containing peptide (TAGFLR; SEQ ID NO:3) was
y = 0.9509x - 0.3148 (R² = 0.9846) while that for a lysine-containing peptide
(FLSYK; SEQ ID NO:2) was y = 0.9492x + 0.4112 (R² = 0.9937). The term y
stands for the intensity ratio of the deuterium-labeled to unlabeled acetylated
peptides and x stands for the relative amount of these two peptides.
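
The two linear calibration equations reported above can be applied in either direction: to predict the intensity ratio expected for a known amount ratio, or to estimate the amount ratio from a measured intensity ratio. The Python sketch below shows one way this could be done. The class and method names are hypothetical; only the slopes and intercepts come from the text, and using the lines outside the range over which they were fitted would be an extrapolation.

```python
from typing import NamedTuple


class Calibration(NamedTuple):
    """Straight-line calibration y = slope * x + intercept.

    x is the relative amount of deuterium-labeled to unlabeled peptide and
    y is the corresponding ion current intensity ratio.
    """
    slope: float
    intercept: float

    def predict_intensity_ratio(self, amount_ratio: float) -> float:
        return self.slope * amount_ratio + self.intercept

    def estimate_amount_ratio(self, intensity_ratio: float) -> float:
        # Invert the fitted line to recover x from a measured y.
        return (intensity_ratio - self.intercept) / self.slope


TAGFLR = Calibration(slope=0.9509, intercept=-0.3148)  # arginine peptide
FLSYK = Calibration(slope=0.9492, intercept=0.4112)    # lysine peptide

measured = 2.5  # hypothetical measured intensity ratio
print(round(TAGFLR.estimate_amount_ratio(measured), 3))
print(round(FLSYK.estimate_amount_ratio(measured), 3))
```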
These results strongly suggest that a method in which internal standard
peptides are created by isotopic labeling and ratios of native to internal standard
species quantified by MS will be useful in determining the relative
concentration of signature peptides.

It is concluded that isotopically labeled internal standard analysis
provides a useful method for the quantification of peptides. There is a strong
possibility that when coupled with signature peptides derived from proteins,
these combined methods will provide a powerful new method for the
quantification of multiple proteins in complex mixtures.


Example VIII. Sample Protocol for Analysis of Protein Expression

The following protocol is one of many according to the invention that
are useful for analyzing protein expression levels.

Step 1. Reduction of control and experimental samples containing
several thousand proteins in a robotic sample handling system.

Step 2. Alkylate sulfhydryl groups in the experimental sample. If cysteine
selection is desired the alkylating reagent is an affinity tagged maleimide. If the
selection will be for another amino acid, the alkylating agent is iodoacetic acid
or iodoacetamide.

Step 2'. Alkylate sulfhydryl groups in the control sample. If cysteine
selection is desired, the alkylating reagent is a heavy isotope affinity tagged
maleimide. If the selection will be for another amino acid, the alkylating agent
is heavy isotope labeled iodoacetic acid or iodoacetamide. This allows proteins
originating from the experimental sample to be distinguished from those
originating from the control sample.

Step 3. The experimental and isotopically labeled control samples are
combined.

Step 4. The proteins are separated by 2-D electrophoresis or 2-D
chromatography. Reduction and alkylation may destroy tertiary and quaternary
structure of the proteins. This would have a large impact on electrophoresis and
chromatography, but the results could still be extrapolated to the native protein
sample.

Step 5. Purified or partially purified proteins are subjected to
proteolysis; generally with trypsin, but any proteolytic enzyme or combination
of enzymes could be used. Enzymatic digestion would be done either in a robotic
system or with an immobilized enzyme column.

Step 6. Digested samples are transferred directly to the MALDI plates.

Example IX. Use of Fragment Ions to Distinguish Isobaric Peptides

A C-terminal arginine containing peptide (NH2-H-L-G-L-A-R-OH;
1 mg) (SEQ ID NO:4) was dissolved in 1 ml of 0.1 M phosphate buffer pH 7.5.
This solution was then divided into two equal parts (500 µl each). One part was
acetylated with N-(1H3)acetoxysuccinimide and the other with
N-(2H3)acetoxysuccinimide. Both parts were then mixed and purified on a C18
reversed-phase column (RPC). Fractions from the RPC were collected and
subjected to ESI-MS/MS. The singly charged precursor ion isotope cluster of
m/z 708.50/711.50 [M+H]+ was isolated and subjected to collision-activated
dissociation (CAD).

The tandem mass spectrum given by the CAD of the singly charged
differentially acetylated precursor ion isotope cluster of Ac-HLGLAR-OH (m/z
708.50/711.50) (SEQ ID NO:4) yields the fragment ions listed in Table 1. Both
N- and C-terminal fragment ions of type a, b and y are present in this spectrum.
Complete b_n or y_n ion series are not seen in this spectrum. All prominent
N-terminal fragment ions (a and b type) appeared as isotope clusters, separated by
3 amu. In contrast, the C-terminal (y-type) fragment ions are not seen as
isotope clusters separated by 3 amu; rather they coincide, since these ions do
not contain an acetyl group. Isotope ratios of all b-ions were determined by
dividing the peak height of the acetylated form by the peak height of the
trideuteroacetylated form. For example, the relative abundance (peak height) of
m/z 534.1 divided by the relative abundance of m/z 537.2 was used to get the ratio
of 1.07 for the b5 ion (see Tables 1 and 2). Fragment ions y5-y2 confirm the
N-terminal sequence of Ac-H-L-G-L (SEQ ID NO:5), whereas fragment ions b5-b2
confirm the C-terminal sequence of G-L-A-R-OH (SEQ ID NO:6).

It is evident that the isotope labeling ratios carry through from the
precursor ion to the fragment ions. This differential labeling can be used to
achieve relative quantification of peptides by tandem mass spectrometry in
proteomics. This also permits multiple precursor ions having the same mass
("isobaric peptides") to be readily distinguished and quantified after CAD of the
parent ion in this second mass spectrometry dimension.
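
The pattern described in this example (N-terminal b ions appearing as pairs 3 amu apart, C-terminal y ions coinciding) follows directly from where the acetyl label sits. The Python sketch below computes theoretical singly charged b and y ion masses for Ac-HLGLAR with an acetyl or a trideuteroacetyl group on the amino-terminus. It is an illustration only: the function names are hypothetical, and the monoisotopic residue, water, proton and acetyl masses are standard reference values supplied for this sketch rather than taken from the source. The calculated values are not expected to match the measured m/z values exactly.

```python
# Monoisotopic residue masses (u); standard values supplied for this sketch.
RESIDUE = {"G": 57.02146, "A": 71.03711, "L": 113.08406,
           "H": 137.05891, "R": 156.10111}
WATER = 18.01056
PROTON = 1.00728
ACETYL = 42.01056              # CH3CO- replacing H on the N-terminal amine
D3_ACETYL = ACETYL + 3.01883   # CD3CO-: three 2H in place of three 1H


def b_ions(seq, n_term_mod):
    """Singly charged b1..b(n-1); these retain the N-terminal label."""
    out, running = [], n_term_mod + PROTON
    for aa in seq[:-1]:
        running += RESIDUE[aa]
        out.append(running)
    return out


def y_ions(seq):
    """Singly charged y1..y(n-1); these carry no N-terminal label."""
    out, running = [], WATER + PROTON
    for aa in reversed(seq[1:]):
        running += RESIDUE[aa]
        out.append(running)
    return out


def precursor(seq, n_term_mod):
    """Singly protonated intact peptide, [M+H]+."""
    return sum(RESIDUE[aa] for aa in seq) + WATER + n_term_mod + PROTON


SEQ = "HLGLAR"
light_b, heavy_b = b_ions(SEQ, ACETYL), b_ions(SEQ, D3_ACETYL)
print(round(precursor(SEQ, ACETYL), 2), round(precursor(SEQ, D3_ACETYL), 2))
for n, (light, heavy) in enumerate(zip(light_b, heavy_b), start=1):
    print(f"b{n}: {light:.2f} / {heavy:.2f}")
for n, mass in enumerate(y_ions(SEQ), start=1):
    print(f"y{n}: {mass:.2f} (identical for both labeled forms)")
```

The calculated b5 pair and precursor pair fall close to the m/z 534.1/537.2 and 708.50/711.50 values quoted in this example, and the y ion masses do not depend on which label was used.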
Table 2: Statistical analysis of fragment ion ratios of differentially acetylated
peptide NH2-H-L-G-L-A-R-OH (SEQ ID NO:4)

Fragment ion    Experimental ratio    Expected ratio
M-NH3           9.6/9.0 = 1.07        1.0
M-H2O           7.54/7.5 = 1.0        1.0
M-H2O-17        0.64/0.61 = 1.05      1.0
M-H2O-Ac        7.97/7.61 = 1.05      1.0
b5+H2O          2.4/2.3 = 1.04        1.0
b5              8.5/7.97 = 1.07       1.0
b4              8.68/8.1 = 1.07       1.0
b3              4.6/4.1 = 1.12        1.0
b2              1.44/1.2 = 1.20       1.0
a4              2.65/2.29 = 1.16      1.0

Mean +/- SD of the experimental ratios: 1.08 +/- 0.060. % Error: 8.0.
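
The summary statistics in Table 2 can be recomputed from the listed peak heights. The Python sketch below is an illustration under stated assumptions: the variable names are hypothetical, the sample (n - 1) standard deviation is assumed, and the percent error is taken as the deviation of the mean from the expected ratio of 1.0. Because it works from unrounded ratios, the last digit may differ slightly from the tabulated values.

```python
from statistics import mean, stdev

# (acetylated peak height, trideuteroacetylated peak height) from Table 2.
PEAK_HEIGHTS = {
    "M-NH3": (9.6, 9.0),
    "M-H2O": (7.54, 7.5),
    "M-H2O-17": (0.64, 0.61),
    "M-H2O-Ac": (7.97, 7.61),
    "b5+H2O": (2.4, 2.3),
    "b5": (8.5, 7.97),
    "b4": (8.68, 8.1),
    "b3": (4.6, 4.1),
    "b2": (1.44, 1.2),
    "a4": (2.65, 2.29),
}
EXPECTED_RATIO = 1.0

ratios = {ion: light / heavy for ion, (light, heavy) in PEAK_HEIGHTS.items()}
avg = mean(ratios.values())
sd = stdev(ratios.values())  # sample standard deviation (n - 1)
pct_error = 100.0 * (avg - EXPECTED_RATIO) / EXPECTED_RATIO

for ion, ratio in ratios.items():
    print(f"{ion:10s} {ratio:.2f}")
print(f"mean = {avg:.2f}, SD = {sd:.3f}, % error = {pct_error:.1f}")
```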



The complete disclosures of all patents, patent applications including
provisional patent applications, and publications, and electronically available
material cited herein are incorporated by reference. The foregoing detailed
description and examples have been provided for clarity of understanding only. No
unnecessary limitations are to be understood therefrom. The invention is not
limited to the exact details shown and described; many variations will be apparent
to one skilled in the art and are intended to be included within the invention defined
by the claims.
WHAT IS CLAIMED IS:

1. A method for analyzing differences in protein content among plural
protein samples, the method comprising:

fragmenting at least a first protein sample and a second protein sample
to produce a first peptide pool and a second peptide pool;

isotopically labeling at least a portion of the peptides in at least one of
the pools so as to permit resolution of otherwise identical peptides in the first
and second peptide pools by mass analysis;

contacting peptides from at least a portion of both of the peptide pools
with a capture moiety to yield affinity-selected peptides comprising an affinity
ligand, wherein the capture moiety selects for the affinity ligand; and

analyzing the affinity-selected peptides by mass spectrometry to
determine one or more differences between the first and second samples.

2. The method of claim 1 wherein the labeling step comprises labeling at least
one of the N-termini or the C-termini of the portion of the peptides.

3. The method of claim 2 wherein the labeling step comprises labeling both the
N-termini and the C-termini of the portion of the peptides.

4. The method of claim 2 wherein the affinity ligand is an endogenous affinity
ligand.

5. The method of claim 1 wherein the affinity ligand does not comprise the
isotope label.

6. The method of claim 1 further comprising combining at least portions of
the first and second pools after the labeling step but prior to the analyzing step.

7. The method of claim 1 wherein the affinity ligand is endogenous.

8. The method of claim 7 wherein the endogenous affinity ligand
comprises an antigen.

9. The method of claim 8 wherein the affinity ligand comprises at least one
antigen selected from the group consisting of a sugar, a lipid, a glycolipid and a
peptide.

10. The method of claim 1 further comprising chemically coupling the
affinity ligand to peptides.

11. The method of claim 1 further comprising reducing and alkylating the
protein samples prior to the fragmenting step.

12. The method of claim 1 wherein the affinity-selected peptides comprise
at least one low abundance amino acid selected from the group consisting of
cysteine, tryptophan, histidine, methionine and tyrosine.

13. The method of claim 1 wherein the affinity-selected peptides comprise
at least one phosphate group.

14. The method of claim 13 wherein the affinity-selected peptides comprise
at least one amino acid selected from the group consisting of
phosphotyrosine, phosphoserine and phosphothreonine.

15. The method of claim 1 wherein the affinity-selected peptides comprise
at least one oligosaccharide.
16. The method of claim 1 further comprising, prior to the analysis step,
contacting the affinity-selected peptides with a second capture moiety to yield a
subset of affinity-selected peptides comprising a second affinity ligand, wherein
the capture moiety selects for the second affinity ligand.

17. The method of claim 16 wherein the second affinity ligand is an
endogenous ligand.

18. The method of claim 17 wherein the first affinity ligand comprises the
isotope label.

19. The method of claim 1 further comprising fractionating the
affinity-selected peptides prior to analysis.

20. The method of claim 19 wherein the fractionation technique is selected
from the group consisting of reversed phase chromatography, ion
exchange chromatography, hydrophobic interaction chromatography,
size exclusion chromatography, capillary gel electrophoresis, capillary
zone electrophoresis and capillary electrochromatography, capillary
isoelectric focusing, immobilized metal affinity chromatography and
affinity electrophoresis.

21. The method of claim 1 further comprising fractionating the peptides
subsequent to the contacting step to produce a second subset of peptides for
mass spectrometric analysis.

22. The method of claim 1 wherein the mass spectrometric analysis is
selected from the group consisting of matrix assisted laser desorption ionization
(MALDI), electrospray ionization (ESI), fast atom bombardment (FAB),
electron impact ionization, atmospheric pressure chemical ionization (APCI),
time-of-flight (TOF), quadrupole, ion trap, magnetic sector, ion cyclotron
resonance mass, or combinations thereof.

23. The method of claim 1 wherein the labeling step comprises labeling the
first peptide pool with a first isotopic variant of a chemical moiety and the
second peptide pool with a second isotopic variant of the chemical moiety to
yield peptides in the first and second pools that are chemically equivalent but
isotopically distinct; and wherein the analyzing step comprises analyzing the
first sample and second samples by mass spectrometry; and comparing the mass
spectrometry of the first and second samples.

24. The method of claim 23 wherein the analyzing step further comprises:

generating a first isotope ratio for the samples labeled with the first
isotopic variant;

generating a second isotope ratio for the samples labeled with the second
isotopic variant;

comparing the first isotope label ratio with the second isotope label
ratio, wherein a difference between the first isotope label ratio with the second
isotope label ratio is indicative of a difference in the relative concentration of
the labeled peptides in the first and second sample.

25. The method of claim 24 wherein the first and second samples are
combined prior to the analyzing step.

26. A method for detecting a difference in the concentration of a protein
originally present in a first sample and in a second sample, each sample
comprising a plurality of peptides derived from fragmentation of proteins
originally present in the sample, the method comprising:

covalently attaching a first isotopic variant of a chemical moiety to at
least one of an amino group and a carboxyl group of a peptide in the first
sample to yield at least one first isotopically labeled peptide;

covalently attaching a second isotopic variant of the chemical moiety to
the at least one of a free amino group and a free carboxyl group of a peptide in
the second sample to yield at least one second isotopically labeled peptide,
wherein the first and second isotopically labeled peptides are chemically
equivalent yet isotopically distinct;

mixing at least portions of the first and second samples to yield a
combined sample; and

subjecting the combined sample to mass spectrometric analysis to
determine a normalized isotope ratio characterizing peptides derived from
proteins whose concentration is the same in the first and second samples and an
isotope ratio of the first and second isotopically labeled peptides, wherein a
difference in the isotope ratio of the first and second isotopically labeled
peptides and the normalized isotope ratio is indicative of a difference in
concentration in the first and second samples of a protein derived from the
peptide.

27. A method for detecting a difference in the concentration of a protein
present in a first sample and in a second sample, each sample comprising a
plurality of proteins, the method comprising:

fragmenting proteins in the first and second samples to yield at least one
peptide in each sample;

covalently attaching a first isotopic variant of a chemical moiety to at
least one of an amino group and a carboxyl group of a peptide in the first
sample to yield at least one first isotopically labeled peptide;

covalently attaching a second isotopic variant of the chemical moiety to
the at least one of a free amino group and a free carboxyl group of a peptide in
the second sample to yield at least one second isotopically labeled peptide,



wherein the first and second isotopically labeled peptides are chemically 
equivalent yet isotopically distinct; 

mixing at least portions of the first and second samples to yield a 
combined sample; and

subjecting the combined sample to mass spectrometric analysis to
determine a normalized isotope ratio characterizing peptides derived from 
proteins whose concentration is the same in the first and second samples and an 
isotope ratio of the first and second isotopically labeled peptides, wherein a 
difference in the isotope ratio of the first and second isotopically labeled 
peptides and the normalized isotope ratio is indicative of a difference in
concentration in the first and second samples of a protein derived from the 
peptide. 
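
As an illustration of the ratio comparison recited in claim 27, the following Python sketch computes an isotope ratio for each labeled peptide pair and compares it with a normalized ratio. The claim does not state how the normalized ratio is obtained; taking the median over all pairs (on the assumption that most proteins are unchanged between the samples) is an assumption of this sketch, as are the function names, the peak-intensity input format and the 20% tolerance.

```python
from statistics import median


def isotope_ratios(pairs):
    """Map peptide -> second-label / first-label peak intensity."""
    return {pep: second / first for pep, (first, second) in pairs.items()}


def flag_changed(pairs, rel_tol=0.2):
    """Return (normalized_ratio, {peptide: ratio relative to normalized})."""
    ratios = isotope_ratios(pairs)
    # Assumption: most peptides come from unchanged proteins, so the median
    # ratio stands in for the normalized isotope ratio.
    normalized = median(ratios.values())
    changed = {}
    for pep, ratio in ratios.items():
        relative = ratio / normalized
        if abs(relative - 1.0) > rel_tol:
            changed[pep] = relative
    return normalized, changed


# Hypothetical peak intensities: (first-sample label, second-sample label)
pairs = {
    "pepA": (100, 200),
    "pepB": (50, 100),
    "pepC": (80, 160),
    "pepD": (100, 600),
}
normalized, changed = flag_changed(pairs)
print(normalized, changed)
```

In this toy input only pepD departs from the normalized ratio, so only pepD would be reported as derived from a protein whose concentration differs between the samples.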

28. The method of claim 27 wherein the first and second chemical moieties are 
attached to at least one amino group on peptides in the first and second samples.

29. The method of claim 27 wherein each member of at least one pair of 
chemically equivalent, isotopically distinct peptides comprises at least one 
affinity ligand, the method further comprising, prior to determining the isotope
ratios, contacting the peptides with a capture moiety to select peptides
comprising the at least one affinity ligand.

30. The method of claim 29 further comprising subjecting the selected peptides 
comprising the at least one affinity ligand to mass spectrometric analysis to
detect at least one peptide; and identifying the protein from which the detected
peptide was derived.

31. The method of claim 30 wherein the detected peptide is a signature peptide 
for a protein, the method further comprising determining the mass of the
signature peptide and using the mass of the signature peptide to identify the
protein from which the detected peptide was derived. 

32. The method of claim 30 further comprising determining the amino acid
sequence of the detected peptide and using the amino acid sequence of the
detected peptide to identify the protein from which the detected peptide was
derived. 

33. The method of claim 29 further comprising subjecting the selected peptides 
comprising the at least one affinity ligand to mass spectrometric analysis to
determine peak intensities; and quantitating isotope ratios from the peak
intensities. 

34. The method of claim 29 further comprising, prior to contacting the peptides 
with the capture moiety, covalently attaching at least one affinity ligand to at
least one peptide derived from the fragmentation of the proteins.

35. The method of claim 29 further comprising, prior to fragmenting the 
proteins, covalently attaching at least one affinity ligand to at least one protein
in the sample.

36. The method of claim 35 further comprising reducing and alkylating the
proteins with an alkylating agent prior to fragmenting the proteins. 

37. The method of claim 36 wherein the at least one affinity ligand is
covalently attached to the alkylating agent.

38. The method of claim 35 wherein the at least one affinity ligand is 
covalently attached to an amino acid of the peptide selected from the group 
consisting of cysteine, tyrosine, tryptophan, histidine and methionine.

39. The method of claim 35 wherein the affinity ligand comprises a moiety
selected from the group consisting of a peptide antigen, a polyhistidine, a biotin, 
a dinitrophenol, an oligonucleotide and a peptide nucleic acid. 

40. The method of claim 29 wherein at least one peptide comprises an
endogenous affinity ligand.

41. The method of claim 40 wherein the endogenous affinity ligand comprises
a moiety selected from the group consisting of a cysteine, a histidine, a
phosphate group, a carbohydrate moiety and an antigenic amino acid sequence.

42. The method of claim 29 comprising attaching a plurality of affinity ligands, 
each to at least one protein or peptide, and contacting the peptides with a 
plurality of capture moieties to select peptides comprising at least one affinity
ligand.

43. The method of claim 27 wherein the proteins are fragmented using an 
enzyme selected from the group consisting of trypsin, chymotrypsin, gluc-C, 
endo lys-C, pepsin, papain, proteinase K, carboxypeptidase, calpain and
subtilisin.

44. The method of claim 27 further comprising fractionating the peptides prior 
to determining the isotope ratios. 

45. The method of claim 44 wherein fractionating the peptides comprises
subjecting the peptides to at least one separation technique selected from the
group consisting of reversed phase chromatography, ion exchange
chromatography, hydrophobic interaction chromatography, size exclusion
chromatography, capillary gel electrophoresis, capillary zone electrophoresis,
capillary electrochromatography, capillary isoelectric focusing,
immobilized metal affinity chromatography and affinity electrophoresis.

46. The method of claim 27 wherein the sample comprises at least about 100
proteins.

47. The method of claim 27 wherein using the mass of the signature peptide to 
identify the protein from which the signature peptide was derived comprises 
comparing the mass of the signature peptide with the masses of reference
peptides derived from putative proteolytic cleavage of a plurality of reference
proteins in a database, wherein at least one reference peptide comprises at least
one affinity ligand. 

48. The method of claim 47 wherein peptides derived from proteolytic cleavage
of the plurality of reference proteins are, prior to comparing the mass of the
signature peptide with the masses of the reference peptides, computationally
selected to exclude reference peptides that do not contain an amino acid upon 
which the affinity selection is based. 
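
The database comparison of claims 47 and 48 can be sketched in Python as follows. This is a minimal illustration and not the applicants' implementation: the trypsin rule used (cleave after K or R unless the next residue is P), the choice of cysteine as the selected residue, the 0.01 Da tolerance, the function names and the two-entry reference database are all assumptions, and no mass is added for the affinity ligand itself. Residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da); one water is added per peptide.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Putative cleavage after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[r] for r in peptide) + WATER


def identify(observed_mass, database, selected_residue="C", tol=0.01):
    """Return (protein, peptide) pairs whose reference peptide contains the
    selected residue and matches the observed mass within tol."""
    hits = []
    for name, sequence in database.items():
        for pep in tryptic_digest(sequence):
            if selected_residue not in pep:
                continue  # excluded before any mass comparison (claim 48)
            if abs(peptide_mass(pep) - observed_mass) <= tol:
                hits.append((name, pep))
    return hits


# Hypothetical reference database
database = {"protA": "AKCR", "protB": "GKGR"}
hits = identify(peptide_mass("CR"), database)
print(hits)
```

Excluding reference peptides that lack the selected residue before comparing masses shrinks the search space, which is the point of the computational selection in claim 48.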

49. The method of claim 27 wherein the protein is in regulatory flux in
response to a stimulus, wherein the first sample is obtained from the biological
environment before application of the stimulus and the second sample is 
obtained from the biological environment after application of the stimulus. 

50. The method of claim 27 wherein the first and second samples are obtained
from different organisms, cells, organs, tissues or bodily fluids, the method
further comprising determining differences in concentration of at least one 
protein in the organisms, cells, organs, tissues or bodily fluids from which the 
samples were obtained.

51. The method of claim 27 further comprising identifying a plurality of
isotopically labeled proteins having substantially the same isotope ratios, 
wherein the existence of said plurality of isotopically labeled proteins is 
indicative that the proteins are co-regulated. 

52. The method of claim 27 further comprising identifying a plurality of 
isotopically labeled peptides having substantially the same isotope ratios, 
wherein the existence of said plurality of isotopically labeled peptides is 
indicative that the peptides are derived from the same protein, or from proteins
that are co-regulated.
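
The grouping recited in claims 51 and 52 can be pictured with a short Python sketch. The claims do not define "substantially the same"; the 10% relative tolerance, the greedy grouping against the lowest ratio in each group, the function name and the example ratios are all assumptions made for illustration.

```python
def group_by_ratio(ratios, rel_tol=0.1):
    """Group names whose isotope ratios lie within rel_tol of the lowest
    ratio in the group; return only groups with more than one member."""
    groups = []  # list of (lowest ratio in group, [names])
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        if groups and ratio <= groups[-1][0] * (1 + rel_tol):
            groups[-1][1].append(name)
        else:
            groups.append((ratio, [name]))
    return [members for _, members in groups if len(members) > 1]


# Hypothetical isotope ratios for five labeled peptides
ratios = {"p1": 1.0, "p2": 1.05, "p3": 2.0, "p4": 2.1, "p5": 5.0}
candidates = group_by_ratio(ratios)
print(candidates)
```

Each returned group is only a candidate set: peptides with matching ratios may come from one protein or from co-regulated proteins, and the ratio alone does not distinguish the two cases.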

53. The method of claim 27 wherein the samples are obtained from a biological 
environment, and wherein the first sample is obtained from the biological 
environment before application of a stimulus and the second sample is obtained
from the biological environment after application of the stimulus.

54. A method for determining whether a protein is present in one sample but 
not in another sample, each sample comprising a plurality of proteins, the 
method comprising: 

providing a first and second sample, wherein the first sample is obtained
from a biological environment prior to the application of a stimulus and the
second sample is obtained from the biological environment after the application 
of the stimulus; 

fragmenting proteins in the first and second samples to yield peptides;

partitioning the first sample into a first subsample and a second
subsample;

contacting the peptides in the first subsample with a first acylating agent 
comprising a first isotope; 

contacting the peptides in the second subsample with a second acylating
agent comprising a second isotope;

contacting the peptides in the second sample with a third acylating agent
comprising a third isotope, wherein the first, second and third acylating agents 
are chemically equivalent yet isotopically distinct; 

mixing at least portions of the first and second subsamples and the
second sample to yield a combined sample;

fractionating the peptides in the combined sample to yield a plurality of
peptide fractions amenable to mass spectrometric isotope ratio analysis; and

subjecting at least one peptide fraction to mass spectrometric isotope
ratio analysis, wherein the presence of a doublet indicates the absence of the
protein in the second sample and the presence of a single peak indicates the
absence of the protein in the first sample.
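
The peak-pattern logic of claim 54 can be written out as a small Python sketch. The first sample carries two labels (one per subsample) and the second sample carries a third, so a peptide present only in the first sample appears as a doublet and a peptide present only in the second sample appears as a single peak. The function name, the intensity inputs and the noise threshold are assumptions, and the case where all three peaks appear is an inference from the labeling scheme rather than something the claim recites.

```python
def classify_peak_pattern(i_first, i_second, i_third, noise=0.0):
    """Classify a peptide from the intensities of its three isotopic forms.

    i_first, i_second: peaks from the two labeled subsamples of sample 1.
    i_third: peak from the labeled sample 2.
    """
    in_first_sample = i_first > noise and i_second > noise  # doublet
    in_second_sample = i_third > noise                      # single peak
    if in_first_sample and in_second_sample:
        return "present in both samples"
    if in_first_sample:
        return "absent from second sample"
    if in_second_sample:
        return "absent from first sample"
    return "not detected"


print(classify_peak_pattern(10.0, 12.0, 0.0))
print(classify_peak_pattern(0.0, 0.0, 8.0))
```

Splitting the first sample into two differently labeled subsamples is what makes the doublet recognizable: a lone peak from one label could otherwise not be told apart from a lone peak from the other sample.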

55. A method for quantifying a peptide comprising: 

subjecting a sample comprising isotopically labeled isobaric peptides to
mass spectrometric analysis to yield fragment ions, wherein at least two of the
fragment ions are isotopically labeled and differ in mass with respect to each
other; and

determining the isotope ratio of the at least two fragment ions, wherein
the isotope ratio is indicative of the relative quantities of the isobaric peptides in
the sample.

56. A method for quantifying a peptide comprising: 

subjecting a sample comprising isotopically labeled peptides to a first
mass spectrometric analysis to identify a plurality of isobaric peptides;

subjecting the plurality of isotopically labeled isobaric peptides to a
second mass spectrometric analysis to yield fragment ions, wherein at least two
of the fragment ions are isotopically labeled and differ in mass with respect to
each other; and

determining the isotope ratio of the at least two fragment ions, wherein
the isotope ratio is indicative of the relative quantities of the isobaric peptides in 
the sample. 

57. A method for quantifying a peptide comprising:

subjecting a sample comprising isotopically labeled peptides to a first 
mass spectrometric analysis to identify a plurality of peptides whose masses 
overlap; 

subjecting the plurality of isotopically labeled isobaric peptides to a
second mass spectrometric analysis to yield fragment ions, wherein at least two
of the fragment ions are isotopically labeled and differ in mass with respect to
each other; and

determining the isotope ratio of the at least two fragment ions, wherein
the isotope ratio is indicative of the relative quantities of the isobaric peptides in
the sample.

58. The method of claim 57 wherein labeled and unlabeled forms of at least one 
peptide are present in the second dimension of mass spectrometry. 

59. A method for analyzing a protein in a sample comprising a plurality of
proteins, the method comprising the steps of:

providing a sample comprising at least one protein comprising a 
signature peptide comprising an affinity ligand; 

fragmenting the proteins in the sample to produce a peptide pool;

contacting peptides from at least a portion of the peptide pool with a
capture moiety that selects for the affinity ligand to select peptides comprising
the signature peptide, wherein the affinity ligand does not include an isotopic 
label; and 

analyzing at least a portion of the peptide pool by mass spectroscopy.

60. The method of claim 59 wherein fragmenting the proteins comprises
contacting the proteins with at least one of a chemical proteolytic agent, an
enzymatic proteolytic agent and a mechanical proteolytic agent.

61. The method of claim 59 wherein the affinity ligand is endogenous to the
signature peptides.

62. The method of claim 61 wherein the affinity ligand comprises an
antigen.

63. The method of claim 62 wherein the affinity ligand comprises an antigen
selected from the group consisting of a sugar, a lipid, a glycolipid, and a
peptide.

64. The method of claim 59 wherein the affinity ligand comprises an
exogenous affinity ligand.

65. The method of claim 59 wherein the protein sample is reduced and
alkylated with an alkylating agent prior to fragmentation.

66. The method of claim 64 wherein the alkylating agent comprises the
affinity ligand.

67. The method of claim 59 wherein the signature peptide comprises at least
one low abundance amino acid selected from the group consisting of cysteine,
tryptophan, histidine, methionine and tyrosine.

68. The method of claim 59 wherein the signature peptide comprises at least 
one phosphate group.

69. The method of claim 59 wherein the signature peptide comprises at least
one oligosaccharide. 

70. The method of claim 59 further comprising fractionating the
affinity-selected peptides prior to analysis.

71. The method of claim 59 further comprising fractionating the peptides in
the peptide pool prior to contacting the peptides with the capture moiety.

72. The method of claim 71 wherein the fractionation technique is selected
from the group consisting of reversed phase chromatography, ion exchange
chromatography, hydrophobic interaction chromatography, size exclusion
chromatography, capillary gel electrophoresis, capillary zone electrophoresis,
capillary electrochromatography, capillary isoelectric focusing,
immobilized metal affinity chromatography and affinity electrophoresis.

73. The method of claim 59 wherein the analyzing step comprises mass 
spectrometric analysis selected from the group consisting of matrix assisted 
laser desorption ionization (MALDI), electrospray ionization (ESI), fast atom
bombardment (FAB), electron impact ionization, atmospheric pressure chemical
ionization (APCI), time-of-flight (TOF), quadrupole, ion trap, magnetic sector,
ion cyclotron resonance mass, or combinations thereof.

74. A method for identifying a protein in a sample comprising a plurality of
proteins, the method comprising:

providing peptides derived from fragmentation of proteins in a sample 
comprising a plurality of proteins, wherein at least one peptide derived from the 
protein to be identified comprises at least one affinity ligand, wherein the 
affinity ligand does not include an isotope label;

contacting the peptides with a capture moiety to select peptides
comprising the affinity ligand; 

fractionating the selected peptides to yield a plurality of peptide
fractions;

subjecting the peptides in at least one peptide fraction to mass
spectrometric analysis to detect at least one peptide derived from the protein to
be identified; and 

identifying the protein from which the detected peptide was derived. 

75. The method of claim 74 wherein the detected peptide is a signature peptide
of the protein to be identified, the method further comprising determining the
mass of the signature peptide and using the mass of the signature peptide to 
identify the protein from which the detected peptide was derived. 

76. The method of claim 74 further comprising determining the amino acid
sequence of the detected peptide and using the amino acid sequence of the
detected peptide to identify the protein from which the detected peptide was 
derived. 

77. The method of claim 74 further comprising, prior to contacting the peptides
with the capture moiety, covalently attaching at least one affinity ligand to at
least one peptide derived from the fragmentation of the proteins. 

78. The method of claim 74 further comprising, prior to fragmenting the
proteins, covalently attaching at least one affinity ligand to at least one protein
in the sample.

79. The method of claim 74 further comprising reducing and alkylating the 
proteins with an alkylating agent prior to fragmenting the proteins.

80. The method of claim 79 wherein the at least one affinity ligand is
covalently attached to the alkylating agent. 

81. The method of claim 74 wherein the at least one affinity ligand is
covalently attached to an amino acid of the peptide selected from the group
consisting of cysteine, tyrosine, tryptophan, histidine and methionine.

82. The method of claim 74 wherein the affinity ligand comprises a moiety 
selected from the group consisting of a peptide antigen, a polyhistidine, a biotin,
a dinitrophenol, an oligonucleotide and a peptide nucleic acid.

83. The method of claim 74 wherein at least one peptide comprises an 
endogenous affinity ligand. 

84. The method of claim 83 wherein the endogenous affinity ligand comprises
a phosphate group or a carbohydrate.

85. The method of claim 84 wherein the endogenous affinity ligand comprises 
a phosphate group, and wherein contacting the peptides with a capture moiety
comprises contacting the peptides at acidic pH with a cationic support surface.

86. The method of claim 83 wherein the endogenous affinity ligand comprises a 
cysteine or a histidine. 

87. The method of claim 83 wherein the endogenous affinity ligand comprises
an antigenic amino acid sequence.

88. The method of claim 74 further comprising attaching a plurality of affinity 
ligands, each to at least one protein or peptide, and contacting the peptides with
a plurality of capture moieties to select peptides comprising at least one affinity
ligand. 

89. The method of claim 74 further comprising fragmenting the proteins in the
sample to yield the peptides.

90. The method of claim 89 wherein the proteins are fragmented using an 
enzyme selected from the group consisting of trypsin, chymotrypsin, gluc-C, 
endo lys-C, pepsin, papain, proteinase K, carboxypeptidase, calpain and
subtilisin.

91. The method of claim 74 wherein fractionating the selected peptides
comprises subjecting the selected peptides to at least one separation technique
selected from the group consisting of reversed phase chromatography, ion
exchange chromatography, hydrophobic interaction chromatography, size
exclusion chromatography, capillary gel electrophoresis, capillary zone
electrophoresis, capillary electrochromatography, capillary isoelectric
focusing, immobilized metal affinity chromatography and affinity
electrophoresis.

92. The method of claim 74 wherein the sample comprises at least about 100 
proteins. 

93. The method of claim 74 wherein using the mass of the signature peptide to
identify the protein from which the signature peptide was derived comprises
comparing the mass of the signature peptide with the masses of reference
peptides derived from putative proteolytic cleavage of a plurality of reference
proteins in a database, wherein at least one reference peptide comprises at least
one affinity ligand.

94. The method of claim 93 wherein peptides derived from proteolytic cleavage
of the plurality of reference proteins are, prior to comparing the mass of the 
signature peptide with the masses of the reference peptides, computationally 
selected to exclude reference peptides that do not contain an amino acid upon
which the affinity selection is based.

95. A method for identifying a protein in a sample comprising a plurality of 
proteins, the method comprising: 

providing peptides derived from fragmentation of proteins in a sample
comprising a plurality of proteins, wherein at least one peptide comprises at
least one affinity ligand;

contacting the peptides with a capture moiety to select peptides 
comprising the at least one affinity ligand; 

determining the mass of at least one peptide comprising the at least one
affinity ligand which is a signature peptide of the protein; and

using the mass of the signature peptide to identify the protein from 
which the signature peptide was derived. 